Hybrid Deep Learning and Optimization Algorithm for Breast Cancer Prediction Using Data Mining
Keywords: Breast cancer prediction, Mining, LDA feature extraction, WHO model, hyperparameter fine-tuning, AERNN model
Breast cancer is the uncontrolled growth of cells caused by abnormal gene activity. Two in ten women worldwide will be diagnosed with breast cancer in their lifetime, and on average a woman is diagnosed with breast cancer every 5 minutes. There is therefore a pressing need for intelligent early prediction methods that support health-care professionals in increasing patient survival rates. Recently, data mining approaches based on Deep Learning (DL) and Machine Learning (ML) have played a beneficial role in the medical field for the detection and classification of diseases. However, prediction accuracy is reduced by the imbalanced nature of the data, in which the positive and negative classes are unequally distributed. To overcome this issue, this work presents a hybrid breast cancer prediction algorithm that combines Linear Discriminant Analysis (LDA), Wild Horse Optimization (WHO), and an Advanced Elman Recurrent Neural Network (AERNN). The LDA model is used for feature extraction, the WHO model is used for feature reduction and for tuning the AERNN's hyperparameters, and the optimized AERNN model is used for classification. The proposed method outperforms prior methods, achieving a precision of 98.51%, a recall of 98.65%, an accuracy of 97.88%, and an F1 score of 98.32%, as well as an RMSE of 1.006 and an MAE of 1.986 in the error evaluation.
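The evaluation measures named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation; the function names (`classification_metrics`, `error_metrics`) and the toy labels are hypothetical and chosen only for illustration. It uses the standard definitions of precision, recall, accuracy, F1, MAE, and RMSE, and shows on a small imbalanced example why accuracy alone can be misleading.

```python
import math


def classification_metrics(y_true, y_pred):
    """Precision, recall, accuracy and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


def error_metrics(y_true, y_score):
    """Mean absolute error and root mean squared error between targets and outputs."""
    errors = [t - s for t, s in zip(y_true, y_score)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


# Hypothetical imbalanced toy data: 2 positive cases among 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred_all_negative = [0] * 10                    # always predicts "no cancer"
y_pred_model = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]     # one hit, one miss, one false alarm

baseline = classification_metrics(y_true, y_pred_all_negative)
model = classification_metrics(y_true, y_pred_model)
print(baseline)
print(model)
```

In this toy example both predictors reach the same accuracy, yet the all-negative baseline has zero recall because it misses every positive case. This is why precision, recall, and F1 are reported alongside accuracy when the classes are unevenly distributed.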
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, anyone who shares or adapts the material must give appropriate credit, provide a link to the license, and indicate if changes were made; anyone who remixes, transforms, or builds upon the material must distribute their contributions under the same license as the original.