An Ensemble Deep Learning Model for Diabetes Disease Prediction
Keywords:
Deep Learning, Diabetes, Ensemble Learning, Healthcare, LSTM
Abstract
Diabetes remains a significant health challenge with serious consequences if left undiagnosed or untreated. Effective diabetes prediction depends on handling the problems common to clinical datasets: label accuracy, outliers, small sample sizes, and missing values. Despite various efforts, there is still room to improve the accuracy of machine learning and deep learning methods for early diabetes detection. In this study, we propose an approach that combines three established deep learning models, Long Short-Term Memory (LSTM), Deep Neural Networks (DNN), and Convolutional Neural Networks (CNN), using a soft voting classifier to enhance predictive performance. We also employ data fusion to address the challenge of small datasets. Evaluated on the Pima Indian Diabetes Dataset (PIDD), the Frankfurt Hospital Germany Diabetes Dataset (FHGDD), and a combined dataset, the model achieved accuracies of 85.9% on PIDD, 98.0% on FHGDD, and 99.81% on the combined dataset. These results outperform those of the individual classifiers, highlighting the effectiveness of the method for diabetes prediction.
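To make the soft voting step concrete, the following is a minimal sketch, not the authors' implementation. It assumes each of the three models has already produced per-class probabilities for the same samples; the function name `soft_vote`, the optional `weights` argument, and the probability values are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Average per-class probabilities across models, then pick the argmax.

    prob_list: list of arrays, each shaped (n_samples, n_classes).
    weights:   optional per-model weights (assumption: equal weights if None).
    """
    probs = np.stack(prob_list, axis=0)               # (n_models, n_samples, n_classes)
    avg = np.average(probs, axis=0, weights=weights)  # (n_samples, n_classes)
    return avg, avg.argmax(axis=1)

# Hypothetical outputs for three patients; columns are
# [P(non-diabetic), P(diabetic)]. These are NOT results from the paper.
lstm_probs = np.array([[0.60, 0.40], [0.30, 0.70], [0.55, 0.45]])
dnn_probs = np.array([[0.70, 0.30], [0.45, 0.55], [0.40, 0.60]])
cnn_probs = np.array([[0.80, 0.20], [0.20, 0.80], [0.35, 0.65]])

avg_probs, labels = soft_vote([lstm_probs, dnn_probs, cnn_probs])
print(labels)  # [0 1 1]
```

For the third patient, the hypothetical LSTM alone leans toward class 0, but the averaged probabilities favor class 1. This is the sense in which soft voting lets more confident models outweigh a less confident one, unlike a simple majority vote over hard labels.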
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.