Optistroke: Harnessing Bat Algorithm-Driven Stacked Ensembles for Enhanced Stroke Prediction Using Machine Learning
Keywords:
Stacked Ensemble, Gradient Descent, Prediction, Optimization, Machine Learning, Healthcare, Stroke
Abstract
Stroke is one of the most serious medical conditions worldwide and must be diagnosed early to minimize the consequences for patients. The proposed BatOptiStroke is a novel technique that improves stroke prediction accuracy by using a stacked ensemble model driven by the Bat Algorithm (BA). To capture a wide range of predictive behavior, BatOptiStroke combines a diverse set of base models, such as Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), and Support Vector Machines (SVM). By iteratively updating the positions and velocities of the bats, which represent the base models, the BA optimizes the ensemble's performance, which in turn increases stroke prediction accuracy. The framework is evaluated on a sizable dataset of stroke patients against each of the base models and against alternative ensemble approaches. The evaluation metrics confirm its predictive capability: the stacked ensemble consistently outperforms the individual base models, and BatOptiStroke reaches 97% accuracy, 89% precision, 95% recall, and a 93% F1 score.
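The abstract does not say how a bat's position encodes the ensemble, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each bat position is a vector of non-negative blending weights over three base models, and that the objective is the log loss of the weighted average of their predicted probabilities (a weighted soft vote standing in for the stacked meta-learner). The update rules follow the standard Bat Algorithm of Yang (2010); the names `bat_minimize`, `ensemble_log_loss`, `PROBS`, and `LABELS`, the toy data, and the local-walk step size are all hypothetical.

```python
import math
import random


def bat_minimize(objective, dim, lower, upper, n_bats=20, n_iter=200,
                 f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    """Minimal Bat Algorithm sketch for box-bounded minimization."""
    rng = random.Random(seed)

    def clip(value):
        return min(max(value, lower), upper)

    # Random initial positions, zero velocities.
    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    fit = [objective(p) for p in pos]
    loud = [1.0] * n_bats    # loudness A_i, decays when a bat improves
    pulse = [0.0] * n_bats   # pulse rate r_i, grows toward r0 when a bat improves

    best_i = min(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[best_i][:], fit[best_i]

    for t in range(1, n_iter + 1):
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            # Frequency-tuned velocity and position update.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (x - b) * freq for v, x, b in zip(vel[i], pos[i], best)]
            cand = [clip(x + v) for x, v in zip(pos[i], vel[i])]
            # Local random walk around the current best solution.
            # The 0.1 step scale is an assumption chosen for this toy problem.
            if rng.random() > pulse[i]:
                cand = [clip(b + 0.1 * mean_loud * rng.uniform(-1.0, 1.0))
                        for b in best]
            cand_fit = objective(cand)
            # Accept improvements with probability tied to loudness.
            if cand_fit <= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
                pulse[i] = r0 * (1.0 - math.exp(-gamma * t))
            if cand_fit < best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit


# Hypothetical held-out stroke probabilities from three base models
# (one tuple per patient, one column per model) and the true labels.
PROBS = [
    (0.9, 0.7, 0.6), (0.8, 0.6, 0.4), (0.7, 0.4, 0.6), (0.1, 0.3, 0.4),
    (0.2, 0.2, 0.5), (0.1, 0.4, 0.3), (0.3, 0.5, 0.6), (0.6, 0.7, 0.5),
]
LABELS = [1, 1, 1, 0, 0, 0, 0, 1]


def ensemble_log_loss(weights):
    """Log loss of the weighted average of the base-model probabilities."""
    total = sum(weights)
    if total <= 0:
        return float("inf")  # all-zero weights define no ensemble
    loss = 0.0
    for probs, label in zip(PROBS, LABELS):
        p = sum(w * q for w, q in zip(weights, probs)) / total
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        loss -= math.log(p) if label == 1 else math.log(1.0 - p)
    return loss / len(LABELS)


best_weights, best_loss = bat_minimize(ensemble_log_loss, dim=3, lower=0.0, upper=1.0)
print("blending weights:", [round(w, 3) for w in best_weights])
print("validation log loss:", round(best_loss, 4))
```

In a real pipeline the columns of `PROBS` would come from out-of-fold predictions of fitted XGBoost, KNN, and SVM models rather than hard-coded values, and the objective could equally be a validation error rate or the score of a trained meta-learner.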
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.