An Elevated Cancer Detection Methodology Using a Hybrid Transfer Learning Approach for Classifying Microarray Cancer Data

Authors

  • Swati Sucharita, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, India
  • Barnali Sahu, Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, India
  • Tripti Swarnkar, Department of Computer Application, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, India

Keywords:

Cancer, microarray data, machine learning, deep learning, feature filtering, feature selection, convolutional neural network, long short-term memory, transfer learning, classification

Abstract

Cancer has emerged as a prominent issue over the last decade, necessitating timely identification for effective treatment. Consequently, the development of an automated diagnostic system for precise cancer detection has considerable significance. Analyzing microarray data is challenging because of the complexity of the data, limited sample sizes, uneven class distribution, noisy data structure, and high variability in feature values. Existing machine learning (ML) approaches yield lower classification accuracy, are expensive to train, and are prone to over-fitting. In contrast to conventional ML methods such as support vector machines, decision trees, and logistic regression, deep learning (DL) methods offer several benefits, including compatibility with large data, automated feature engineering, and ease of use. This study proposes Monte Carlo Relief-F feature filtering (MCRelief-F) as a feature estimator that provides quality assessments of features even in intricate situations characterized by significant interdependencies among features. For feature selection, the scyphozoan optimization technique (SOT) is more successful than traditional approaches because it has better convergence ability, search stability, and optimum-seeking ability. The Hybrid Extensive Kernel Convolutional Neural Network (HEKCNN-LSTM-TL) uses an extensive convolution kernel for local convolution and long short-term memory (LSTM) with transfer learning to improve classification accuracy while shortening training times. The suggested technique is tested on three common microarray cancer datasets (brain, breast, and leukemia), and the feature values are scaled using artificial SOT. Classification accuracy, precision, F-measure, specificity, sensitivity, and the Matthews correlation coefficient (MCC) are used to assess how well the strategy performs. A comparison with state-of-the-art approaches shows that the suggested approach performs better than many current methods, notably on the leukemia dataset.
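The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes a binary (two-class) setting and uses the standard textbook definitions computed from confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper, whose datasets may involve more than two classes.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives (illustrative inputs, not paper data).
    """
    def safe_div(num, den):
        # Return 0.0 when a denominator is zero instead of raising an error.
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    precision = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)   # also called recall
    specificity = safe_div(tn, tn + fp)
    f_measure = safe_div(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient, ranging from -1 to 1.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the function.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which makes it a useful companion measure when classes are unevenly distributed, as the abstract notes is common for microarray data.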

Published

30.08.2023

How to Cite

Sucharita, S., Sahu, B., & Swarnkar, T. (2023). An Elevated Cancer Detection Methodology Using a Hybrid Transfer Learning Approach for Classifying Microarray Cancer Data. International Journal of Intelligent Systems and Applications in Engineering, 11(11s), 336–351. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3478

Section

Research Article