An Effective Data Analysis for Lymphomas Outcomes and Risk Levels using Combined Advanced Data Mining Algorithms

Authors

  • Manu M. R. Research scholar, Professor, SCSE Galgotias University, Greater Noida, Uttar Pradesh, India
  • T. Poongodi Research scholar, Professor, SCSE Galgotias University, Greater Noida, Uttar Pradesh, India

Keywords:

structured, combinational, independent, precision

Abstract

Cancer has become a major, measurable cause of death worldwide, owing to inadequate treatment and late diagnosis. Data mining techniques can help recognize or predict the threat level of cancer across the different dimensions covered by medical data. Such data make it possible to predict cancer risk from a range of factors, both risk factors and investigation factors. Our proposed multidimensional analysis draws on cognitive experts in data management and data analysis, with support from oncology domain experts, to handle large amounts of data with strong analytical skills. The proposed data mining analysis evaluates whole-body lymphoma data extracted through analytical methods in ontology. In general, neural networks (NN), clustering, regression, and decision trees (DT) achieve higher accuracy and precision than association-based methods and the Naive Bayes classifier. The data mining algorithms can be used in combination or separately to provide effective, highly accurate medical data analysis. Predictive analysis of the caCORE and SEER dataset systems, together with systematic experimental integration, showed that the distributive system is extremely complex. We propose some improvements toward a simpler and more effective generic caCORE data model: a divide-and-conquer approach splits the large data volume into smaller portions, which are exposed as combined functionality through data-hiding or web-oriented interface classes, reducing the complexity of the model. Combined data analysis models with multidimensional analysis (MDA) performed better than other models. The proposed methodology gives better results and diagnosis prediction for lymphoma analysis across age groups from age 25 upward. In our evaluation, the combined classifiers gave better results with an accuracy of 88.17%; the accuracy for independent or structured data was 90.10%.
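As an illustration of what "combined classifiers" can mean in practice, the following minimal Python sketch combines several simple classifiers by majority vote and compares accuracy against a single classifier. It is not the authors' method: the record fields (`age`, `stage`, `b_symptoms`), the three rule-based classifiers, the thresholds, and the toy records are all hypothetical and chosen only to make the example runnable.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Record = Dict[str, object]          # hypothetical patient record
Classifier = Callable[[Record], str]


def majority_vote(classifiers: Sequence[Classifier], record: Record) -> str:
    """Return the risk label predicted by most of the classifiers."""
    votes = [clf(record) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


def accuracy(predict: Classifier, records: List[Record], labels: List[str]) -> float:
    """Fraction of records whose predicted label matches the true label."""
    correct = sum(1 for r, y in zip(records, labels) if predict(r) == y)
    return correct / len(records)


# Hypothetical rule-based stand-ins for trained models (assumptions, not from the paper).
def by_age(r: Record) -> str:
    return "high" if r["age"] >= 60 else "low"


def by_stage(r: Record) -> str:
    return "high" if r["stage"] >= 3 else "low"


def by_symptoms(r: Record) -> str:
    return "high" if r["b_symptoms"] else "low"


# Toy data, invented for illustration only.
records = [
    {"age": 72, "stage": 4, "b_symptoms": True},
    {"age": 30, "stage": 1, "b_symptoms": False},
    {"age": 65, "stage": 2, "b_symptoms": False},
    {"age": 45, "stage": 3, "b_symptoms": True},
]
labels = ["high", "low", "low", "high"]

ensemble = [by_age, by_stage, by_symptoms]


def combined(r: Record) -> str:
    return majority_vote(ensemble, r)


combined_accuracy = accuracy(combined, records, labels)
single_accuracy = accuracy(by_age, records, labels)
print(f"combined: {combined_accuracy:.2f}, age-only: {single_accuracy:.2f}")
```

On this toy data the vote corrects the two records that the age-only rule misclassifies. Real classifiers would be trained on a dataset such as SEER, and the figures here say nothing about the accuracies reported in the abstract.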

SEER Stat screenshot taken during an analysis task.

Published

17.05.2023

How to Cite

M. R., M., & Poongodi, T. (2023). An Effective Data Analysis for Lymphomas Outcomes and Risk Levels using Combined Advanced Data Mining Algorithms. International Journal of Intelligent Systems and Applications in Engineering, 11(6s), 500–515. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2875

Section

Research Article