Lung Cancer Detection, Prediction and Analysis of Lifestyle Parameters using ML and AI Techniques

Authors

  • Sarika Davare, Vishal Shirsath, Farook Sayyad

Keywords:

Artificial Intelligence, Lifestyle, Lung Cancer, Machine Learning, Prediction

Abstract

Cancer poses a significant threat to human life and is often diagnosed at a late stage, which makes early prediction crucial. The literature extensively explores Machine Learning, Data Mining, and Artificial Intelligence techniques for the identification, classification, prediction, and detection of various cancers, including lung, breast, prostate, skin, and liver cancer, as well as cancer recurrence. Such predictive models rely on large datasets. The development of lung cancer is closely tied to lifestyle factors such as smoking, air pollution, and dietary imbalance, which points to the potential of lifestyle indicators for early detection. This study constructs a two-component model that uses lifestyle data to predict lung cancer and categorize its severity. The first component examines basic lifestyle parameters; if these indicate potential lung cancer, the second component analyzes each parameter further to predict the cancer level. Several Machine Learning techniques, including Support Vector Machine (SVM), Logistic Regression, and Linear Regression, are applied to predict lung cancer risk and level, and the predictions are analyzed using AI techniques. SVM emerges as the most effective classifier for predicting lung cancer risk from lifestyle factors, while Linear Regression performs best for predicting the total risk score. Gender- and age-specific lifestyle parameters contributing to lung cancer are also identified. The preliminary phase employs Logistic Regression and SVM to predict lung cancer, achieving accuracies of 94% and 90%, respectively. The subsequent component uses SVM, Random Forest, K-Nearest Neighbors (KNN), and Linear Regression to estimate cancer malignancy levels, with reported accuracies reaching 98%, 96%, and 97%. The study aims to predict lung cancer early from lifestyle data, offering insights into risk factors and preventive strategies.
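
The two-component flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the 0-9 score range, the toy dataset, and both labelling rules are assumptions made up for illustration. A hand-written logistic regression stands in for the first component and a small KNN vote for the second; both techniques are named in the abstract, but the pairing and all parameters here are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical lifestyle parameters, each scored 0-9. These names, the score
# range, and both labelling rules below are invented for this sketch.
FEATURES = ("smoking", "air_pollution", "diet_imbalance")
MAX_SCORE = 9

def scale(raw):
    """Map raw 0-9 scores to the range 0-1."""
    return [v / MAX_SCORE for v in raw]

def toy_dataset():
    """Every score combination, labelled by made-up rules (not real data)."""
    rows = []
    for raw in product(range(MAX_SCORE + 1), repeat=len(FEATURES)):
        total = sum(raw)
        at_risk = 1 if total >= 14 else 0
        level = "high" if total >= 22 else "medium" if total >= 18 else "low"
        rows.append((scale(raw), at_risk, level))
    return rows

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, lr=0.5, epochs=200):
    """Component 1: batch gradient descent on the logistic loss."""
    w, b, n = [0.0] * len(FEATURES), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y, _level in rows:
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def knn_level(at_risk_rows, x, k=5):
    """Component 2: majority vote among the k nearest at-risk training rows."""
    nearest = sorted(at_risk_rows, key=lambda r: math.dist(r[0], x))[:k]
    return Counter(r[2] for r in nearest).most_common(1)[0][0]

def predict(raw, model, at_risk_rows):
    """Run component 1, then component 2 only if risk is indicated."""
    w, b = model
    x = scale(raw)
    risk = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
    if risk < 0.5:
        return {"at_risk": False, "level": None}
    return {"at_risk": True, "level": knn_level(at_risk_rows, x)}

rows = toy_dataset()
model = train_logistic(rows)
at_risk_rows = [r for r in rows if r[1] == 1]
print(predict([0, 0, 0], model, at_risk_rows))
print(predict([9, 9, 9], model, at_risk_rows))
```

Gating the second component on the first mirrors the abstract's description: the severity estimate is only computed for records the screening step flags, so the level classifier is trained and queried on at-risk records alone.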

Published

12.06.2024

How to Cite

Sarika Davare. (2024). Lung Cancer Detection, Prediction and Analysis of Lifestyle Parameters using ML and AI Techniques. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 25–36. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6171

Section

Research Article