Catalyzing Diabetes Prediction: Harnessing Machine Learning and Deep Learning for Optimization and Clustering


  • Monelli Ayyavaraiah


Diabetes Prediction, Early Diagnosis, Data Mining, Machine Learning, Healthcare Analytics, Patient Data Security


Diabetes mellitus is a major concern because it kills so many people every year. This chronic condition is characterized by high blood sugar levels, and leaving it untreated adds further complications to the lives of those who have it. Early prediction of diabetes may therefore lower the mortality rate. Data mining can improve diagnosis, but the methods for early prediction and illness detection described in the literature vary in accuracy. At the same time, data security is a major concern when mining information on diabetes. To address these problems, this paper develops a novel model for accurate early prediction of diabetes. In the first phase of the study, an improved principal component analysis is investigated for extracting useful features from the dataset. For classification, a Modified Support Vector Machine (MSVM) is proposed to diagnose diabetes at an early stage, since it has the best classification accuracy. Mining patients' illness findings in the cloud is the key contribution of this study, and the honey bee encryption and decryption algorithm is employed for this purpose. The suggested approach is assessed in terms of accuracy, sensitivity, specificity, precision, and Negative Predictive Value (NPV). The results demonstrate the superiority of the suggested MSVM classifier, which reaches an accuracy of 97.13%, and a comparison with the state of the art confirms this superior performance.
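The five evaluation measures named in the abstract all derive from the counts of a binary confusion matrix. The following is a minimal sketch of the standard definitions only, not the paper's own evaluation code; the names `ConfusionCounts`, `confusion_counts`, and `evaluation_metrics` are hypothetical, and labels are assumed to be encoded as 1 (diabetic) and 0 (non-diabetic).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    tn: int  # non-diabetic, predicted non-diabetic
    fn: int  # diabetic, predicted non-diabetic


def confusion_counts(y_true, y_pred):
    """Tally the four outcomes, assuming labels are exactly 0 or 1."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0 under the 0/1 assumption
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return NaN instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluation_metrics(c):
    """Standard definitions of the five measures listed in the abstract."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "precision": _ratio(c.tp, c.tp + c.fp),    # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    }
```

Reporting sensitivity and NPV alongside accuracy matters in a screening setting, because a classifier can score high accuracy on an imbalanced dataset while still missing many diabetic patients.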








How to Cite

Monelli Ayyavaraiah. (2024). Catalyzing Diabetes Prediction: Harnessing Machine Learning and Deep Learning for Optimization and Clustering. International Journal of Intelligent Systems and Applications in Engineering, 12(21s), 3885 –. Retrieved from



Research Article