Performance Analysis of Chronic Kidney Disease Detection Based on K-Nearest Neighbors Data Mining

Mohtady Ehab Barakat; Chung Gwo Chin; Lee It Ee

Authors

Mohtady Ehab Barakat Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Malaysia.
Chung Gwo Chin Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Malaysia.
Lee It Ee Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Malaysia.

Keywords:

Chronic kidney disease, data mining, K-Nearest Neighbors, linear regression, decision tree

Abstract

Kidney diseases are a leading cause of death in the United States. The Centers for Disease Control and Prevention (CDC) estimated in 2021 that approximately 37 million US adults, or 1 in 7, have chronic kidney disease (CKD), and that most are undiagnosed. Moreover, Medicare costs for people with CKD were $87.2 billion in 2019. Data mining has therefore been used in the healthcare industry to help authorities provide patients with health information and identify patients earlier. In this paper, data mining is applied to the classification of laboratory data from CKD patients. The K-Nearest Neighbors (KNN) algorithm is proposed to train a machine learning model that detects CKD from blood test results such as sugar count, white blood cell count, red blood cell count, hemoglobin, and albumin. The model also includes general factors such as age and blood pressure. In the obtained results, other machine learning methods, such as linear regression and decision tree, produce inferior accuracy. When the model is trained with KNN on a dataset of 400 anonymous patients, the accuracy reaches 99%. Based on the prediction, around 40% of the patients are fully healthy. The aim of this paper is to detect whether a patient has CKD from lab results and general information about the patient.
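To make the classification step concrete, the following is a minimal sketch of KNN on tabular patient data, written with the Python standard library only. It is not the authors' pipeline. The four features are a subset of those named in the abstract, while the synthetic value ranges, the 300/100 split, the choice of k = 5, and all function names are assumptions made purely for illustration.

```python
import math
import random
from collections import Counter


def fit_min_max(rows):
    """Learn per-feature minimum and maximum from the training rows only."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lows, highs):
    """Rescale features so that no single lab value dominates the distance."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=5):
    """Majority label among the k training rows closest to query (Euclidean)."""
    nearest = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def make_synthetic_patients(n, seed=0):
    """Hypothetical data: [age, blood pressure, hemoglobin, albumin], label 1 = CKD.

    The distributions below are invented for the example, not taken from the paper.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        has_ckd = rng.random() < 0.6
        rows.append([
            rng.gauss(60 if has_ckd else 45, 10),      # age
            rng.gauss(90 if has_ckd else 75, 8),       # blood pressure
            rng.gauss(10.5 if has_ckd else 15.0, 1.0), # hemoglobin
            rng.gauss(2.5 if has_ckd else 0.2, 0.5),   # albumin
        ])
        labels.append(int(has_ckd))
    return rows, labels


rows, labels = make_synthetic_patients(400)
split = 300  # assumed hold-out split: 300 training rows, 100 test rows
lows, highs = fit_min_max(rows[:split])
train_x = apply_min_max(rows[:split], lows, highs)
test_x = apply_min_max(rows[split:], lows, highs)
train_y, test_y = labels[:split], labels[split:]

predictions = [knn_predict(train_x, train_y, q, k=5) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"hold-out accuracy on synthetic data: {accuracy:.2f}")
```

The scaling bounds are learned from the training rows alone so that the held-out patients do not influence the preprocessing. Any accuracy printed here reflects the synthetic data only and says nothing about the 99% figure reported for the real dataset.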


