Optimizing Diabetes Prediction: LDA Pre-processing & ANN Classification in Healthcare


  • Soumya K N, Raja Praveen K N


Diabetes Mellitus (DM), Latent Dirichlet Allocation (LDA), Normalization, Pre-processing techniques, Classification accuracy.


Diabetes mellitus (DM) is a chronic disease that poses significant health risks if not well managed, and it places a heavy burden on the current healthcare system. Existing machine learning and deep learning methods struggle to predict the stages of diabetes correctly, and their classification accuracy often drops on massive datasets. In this work, we propose an approach that addresses these problems by integrating LDA-based pre-processing with an ANN for classification. The method combines the probability distribution functions of LDA and the ANN, trained by backpropagation from initialized weights, to improve the accuracy of diabetes classification. Data from the PIMA and NCSU datasets are first pre-processed with min-max normalization. Bivariate filter-based feature selection then identifies the most relevant features, and Pearson correlation with a threshold value further refines the selected feature set. Our experimental results demonstrate the efficacy of the proposed approach, which outperforms existing state-of-the-art methods. By integrating a robust classification model with advanced pre-processing techniques, our strategy yields encouraging results in the accurate prediction of diabetes, which in turn can help improve healthcare management.
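The pre-processing steps named in the abstract (min-max normalization followed by a Pearson-correlation filter with a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the threshold value or whether correlation is measured against the class label or between feature pairs, so the sketch assumes a feature-to-label correlation and a hypothetical threshold of 0.3. The feature names and toy values are invented for illustration only.

```python
import math


def min_max_normalize(column):
    """Rescale a list of numbers to the range [0, 1].

    A constant column has no spread, so it is mapped to all zeros.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists.

    Returns 0.0 when either list has zero variance (correlation undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def select_features(features, target, threshold):
    """Keep feature names whose |Pearson r| with the target meets the threshold.

    `features` maps a feature name to its list of raw values.
    Assumption: correlation is taken against the class label.
    """
    selected = []
    for name, column in features.items():
        scaled = min_max_normalize(column)
        if abs(pearson(scaled, target)) >= threshold:
            selected.append(name)
    return selected


# Hypothetical toy data (not taken from the PIMA or NCSU datasets).
toy_features = {
    "glucose": [85, 90, 140, 160, 150, 100],
    "noise": [3, 1, 2, 3, 1, 2],
}
toy_labels = [0, 0, 1, 1, 1, 0]

# Hypothetical threshold; the abstract does not give the value used.
kept = select_features(toy_features, toy_labels, threshold=0.3)
print(kept)
```

Because Pearson correlation is invariant to positive linear rescaling, normalizing first does not change which features pass the filter; the normalization matters for the downstream classifier, which would receive the scaled columns of the retained features.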








How to Cite

Soumya K N, Raja Praveen K N. (2024). Optimizing Diabetes Prediction: LDA Pre-processing & ANN Classification in Healthcare. International Journal of Intelligent Systems and Applications in Engineering, 12(21s), 729–739. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5468



Research Article