Diabetes Prediction Using Medical Data and Disease Influence Measures using Machine Learning


  • Avinash J. Agrawal Shri Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India
  • Rashmi R. Welekar Shri Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India
  • Namita Parati Maturi Venkata Subba Rao (MVSR) Engineering College, Hyderabad, Telangana State, India
  • Pravin R. Satav Government Polytechnic, Murtijapur, Maharashtra, India
  • Leena H. Patil Priyadarshini College of Engineering, Nagpur, Maharashtra, India
  • Shailendra S. Aote Shri Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India


Support Vector Machine, Decision Trees, Random Forests, machine learning for diabetes prediction


This study develops a predictive model for diabetes from medical data and uses machine learning techniques to investigate how various factors influence the condition. Diabetes is a common chronic illness that affects millions of people worldwide. Early detection and an understanding of the underlying causes can considerably improve patient outcomes and public health initiatives. To this end, a large dataset of medical records was gathered from people with and without diabetes, covering a variety of demographic, lifestyle, and clinical factors. Feature engineering approaches were used to pre-process the data and identify useful features. Prediction models for diabetes risk assessment were built using several machine learning methods, including Decision Trees, Random Forests, and Support Vector Machines. The main contributors to the development of diabetes were also identified by examining disease influence measures. By analyzing feature importance and correlation coefficients, the study aims to clarify the relative importance of risk factors such as age, BMI, family history, and glucose levels. The disease prediction methods were assessed and compared according to their predictive performance. The findings of the analysis are reported in detail to support further development.
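As a rough illustration of how a correlation coefficient can serve as a disease influence measure, the sketch below ranks a few risk factors by the absolute value of their Pearson correlation with a binary diabetes label. The records, the field names (`age`, `bmi`, `glucose`, `diabetic`), and the helper `rank_features` are hypothetical and were made up for this example; they are not the study's dataset or its implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy records, invented for illustration only.
records = [
    {"age": 25, "bmi": 22.0, "glucose": 85, "diabetic": 0},
    {"age": 31, "bmi": 24.5, "glucose": 90, "diabetic": 0},
    {"age": 45, "bmi": 31.0, "glucose": 150, "diabetic": 1},
    {"age": 52, "bmi": 29.5, "glucose": 165, "diabetic": 1},
    {"age": 38, "bmi": 27.0, "glucose": 105, "diabetic": 0},
    {"age": 60, "bmi": 33.5, "glucose": 180, "diabetic": 1},
    {"age": 29, "bmi": 30.0, "glucose": 95, "diabetic": 0},
    {"age": 48, "bmi": 26.0, "glucose": 140, "diabetic": 1},
]


def rank_features(rows, target="diabetic"):
    """Return (feature, r) pairs sorted by |r|, strongest association first."""
    labels = [row[target] for row in rows]
    features = [key for key in rows[0] if key != target]
    scores = {f: pearson([row[f] for row in rows], labels) for f in features}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


ranking = rank_features(records)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

A correlation ranking like this captures only linear, one-variable-at-a-time association. Feature importance from a tree-based model such as a Random Forest can also reflect non-linear effects and interactions, which is one reason to examine both kinds of measure.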




