Developing and validating a Distributed Computing Framework for Big Data Analytics
Keywords:
Developing, validating, Computing Framework, Big Data
Abstract
The unprecedented growth in data volume and complexity calls for computing frameworks that can handle Big Data analytics efficiently. This research develops and validates a distributed computing framework tailored to large-scale data analytics, with the aim of improving scalability, fault tolerance, and performance when processing massive datasets. The research begins with an in-depth review of existing distributed computing frameworks and identifies their strengths and limitations for Big Data analytics. Drawing on this analysis, a novel framework is designed that incorporates strategies to optimize data distribution, parallel processing, and fault recovery. The architecture integrates both batch and real-time processing, so it can handle diverse analytical workloads. To validate the framework, a series of experiments is conducted on representative Big Data sets from various domains. Performance metrics such as processing speed, resource utilization, and scalability are measured and compared against established benchmarks, and the framework is stress-tested to evaluate its robustness under adverse conditions. The research also explores the integration of machine learning algorithms within the distributed framework to enable predictive analytics and support decision-making; the framework's adaptability to different machine learning models is assessed, and the impact on overall system performance is analyzed. The validation results show that the proposed framework achieves significant improvements in processing speed, scalability, and fault tolerance compared with existing solutions. The findings highlight its potential to address the challenges of Big Data analytics and its suitability for deployment in real-world applications. This research contributes to the field of distributed computing by presenting a novel framework specifically tailored to Big Data analytics. The comprehensive validation process establishes its effectiveness and reliability, opening avenues for further research and practical implementation in industries and research domains that deal with massive datasets.
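As a rough illustration of the three mechanisms the abstract names (data distribution, parallel processing, and fault recovery), the following minimal sketch may help. It is not the framework described in this paper: the function names (`partition`, `count_words`, `run_with_retry`, `run_job`), the round-robin partitioning, the word-count workload, and the retry limit are all assumptions chosen for brevity, and threads on a single machine stand in for cluster workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_partitions):
    """Data distribution: spread records round-robin over partitions."""
    parts = [[] for _ in range(num_partitions)]
    for i, record in enumerate(records):
        parts[i % num_partitions].append(record)
    return parts


def count_words(lines):
    """Example per-partition task (hypothetical workload): count words."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def run_with_retry(task, part, max_attempts=3):
    """Fault recovery: re-run a failed task on the same partition."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(part)
        except Exception:
            # Give up only after the last allowed attempt has failed.
            if attempt == max_attempts:
                raise


def run_job(records, task=count_words, num_partitions=4, max_attempts=3):
    """Parallel processing: run the task on every partition, then merge."""
    parts = partition(records, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(
            pool.map(lambda p: run_with_retry(task, p, max_attempts), parts)
        )
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    print(run_job(["big data", "data analytics", "big big data"], num_partitions=2))
```

In a real deployment the partitions would live on separate nodes and a failed task would typically be rescheduled on a different worker; the sketch only shows how the three concerns fit together.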
License
![Creative Commons License](http://i.creativecommons.org/l/by-sa/4.0/88x31.png)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others share and adapt the material, provided they give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.