Analysis of Large SARS-CoV-2 Data using Scalable Genetic Algorithm with Enhanced Bi-LSTM Method

Authors

  • Upendra Singh, Research Scholar, Department of Computer Science & Engineering, Dr. A.P.J Abdul Kalam University, Indore, MP, India
  • Ajay Raundale, Research Supervisor, Department of Computer Science & Engineering, Dr. A.P.J Abdul Kalam University, Indore, MP, India

Keywords:

COVID-19, SARS-CoV-2, Scalable model, Big Data, Genetic Algorithm (GA)

Abstract

Coronavirus Disease 2019 (COVID-19), caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), emerged in late 2019, spread rapidly throughout the world, and has reached most countries. This has prompted research into the outbreak’s spread and growth. As case numbers rise, SARS-CoV-2 datasets grow with them, so a scalable model is needed to handle very large SARS-CoV-2 datasets. This paper proposes a scalable machine learning algorithm for prediction on such datasets: a Scalable Susceptible–Infected (SSI) model, consisting of a Scalable Genetic Algorithm with an enhanced bi-LSTM, that predicts coronavirus disease using a big data framework. Experimental results on epidemic data from several cities indicate that people infected with SARS-CoV-2 show a higher level of infection in the latter part of the first week after becoming infected, which is relevant to the epidemic’s transmission rate. In addition, relative to conventional epidemic models, the hybrid model substantially reduces prediction errors and achieves better mean absolute percentage errors (MAPEs).
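For readers unfamiliar with the metric named above, MAPE is conventionally computed as the mean of |actual − predicted| / |actual|, expressed as a percentage. The following is a minimal Python sketch of that standard definition only; it is not the authors' evaluation code, and the function name `mape` and the sample case counts are hypothetical.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # The ratio is undefined when an observed value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical daily case counts, for illustration only.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(f"MAPE: {mape(observed, forecast):.2f}%")
```

A lower value indicates a closer fit. Because each error is divided by the observed value, the metric is undefined for days with zero observed cases, which is why the sketch rejects such inputs instead of skipping them silently.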

Published

12.07.2023

How to Cite

Singh, U., & Raundale, A. (2023). Analysis of Large SARS-CoV-2 Data using Scalable Genetic Algorithm with Enhanced Bi-LSTM Method. International Journal of Intelligent Systems and Applications in Engineering, 11(9s), 59–79. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3096

Section

Research Article