Developing and validating a Distributed Computing Framework for Big Data Analytics

Authors

  • Ahmad Ali Khalifah Al-Zoubi

Keywords:

Developing, validating, Computing Framework, Big Data.

Abstract

The unprecedented growth in data volume and complexity has necessitated the evolution of advanced computing frameworks capable of handling Big Data analytics efficiently. This research focuses on the development and validation of a distributed computing framework tailored to the challenges posed by large-scale data analytics. The proposed framework aims to enhance scalability, fault tolerance, and performance, addressing the unique requirements of processing massive datasets. The research begins with an in-depth review of existing distributed computing frameworks and identifies their strengths and limitations in the context of Big Data analytics. Drawing on insights from this analysis, a novel framework is designed, incorporating innovative strategies to optimize data distribution, parallel processing, and fault recovery mechanisms. The architecture integrates both batch and real-time processing capabilities, ensuring versatility in handling diverse analytical workloads. To validate the efficacy of the proposed framework, a series of experiments is conducted using representative Big Data sets from various domains. Performance metrics such as processing speed, resource utilization, and scalability are measured and compared against established benchmarks. Additionally, the framework is subjected to stress-testing scenarios to evaluate its robustness under adverse conditions. The research also explores the integration of machine learning algorithms within the distributed framework to enable predictive analytics and enhance decision-making capabilities. The adaptability of the framework to different machine learning models is assessed, and its impact on overall system performance is analyzed. The validation results demonstrate that the proposed distributed computing framework exhibits significant improvements in processing speed, scalability, and fault tolerance compared to existing solutions. The findings highlight its potential to address the challenges posed by Big Data analytics and its suitability for deployment in real-world applications. This research contributes to the field of distributed computing by presenting a novel framework specifically tailored for Big Data analytics. The comprehensive validation process establishes its effectiveness and reliability, opening avenues for further research and practical implementation in industries and research domains dealing with massive datasets.

Published

24.03.2024

How to Cite

Ahmad Ali Khalifah Al-Zoubi. (2024). Developing and validating a Distributed Computing Framework for Big Data Analytics. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 3764–3771. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6053

Section

Research Article