An Evaluation of Machine Learning Algorithms and Feature Selection Methods for Cervical Cancer Risk Prediction using Clinical Features
Keywords:
Cervical Cancer, Decision Tree, Feature Selection, RFE, SMOTE, Supervised Machine Learning
Abstract
Cervical cancer is one of the most frequent gynecological cancers worldwide. It is associated with several risk factors, such as sexually transmitted diseases, human papillomavirus and smoking. Early diagnosis of this disease is crucial to lowering fatality rates, and early prediction can help clinicians and patients choose an effective treatment. This study compares machine learning classifiers to determine the best model for predicting cervical cancer and to identify its most significant risk factors. Five algorithms are compared: K-Nearest Neighbor, Gaussian Naïve Bayes, Logistic Regression, Random Forest and Decision Tree (DT). The study then seeks to improve the results of the DT algorithm by balancing the data with the Synthetic Minority Oversampling Technique (SMOTE), selecting the most important features with Recursive Feature Elimination (RFE) and tuning hyperparameters with Grid Search. Overall, the combination of Decision Tree classification with SMOTE and hyperparameter tuning via Grid Search yields the best-performing model.
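To make the balancing step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the study's implementation: the function name `smote`, its parameters, and the toy two-feature data are assumptions chosen for illustration, and the base sample is drawn at random rather than iterating over every minority sample in turn.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their nearest minority-class neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of the base sample within the minority class
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data (two features per record); not the study's dataset.
majority = [[float(a), float(b)] for a in range(6) for b in range(5)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 12.0], [12.0, 11.0]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority class already occupies. In practice, oversampling is usually applied to the training split only, so that synthetic points do not leak into the evaluation data.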
![Flowchart of the proposed solution](https://ijisae.org/public/journals/1/submission_2284_2568_coverImage_en_US.png)
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.