Enhanced Performance of Hadoop Parameters Using Hybrid Meta Heuristics Optimization Techniques

Authors

  • Nandita Yambem, A. N. Nandakumar

Keywords:

Grasshopper swarm optimization (GSO), hybrid meta-heuristics, bat algorithm, MapReduce task

Abstract

With Hadoop becoming the most popular open big data processing platform, various approaches have been proposed to maximize the performance of big data applications. However, the influence of the various performance tuning parameters on overall application speedup is nonlinear and also depends on the characteristics of the application and its data. This work models the task of finding optimal values for the tuning parameters as a search optimization problem and proposes a hybrid meta-heuristic solution that combines grasshopper swarm optimization with the bat algorithm. The hybrid algorithm has good exploration and exploitation ability, so the optimal solution is found without getting trapped in local minima.
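The idea of pairing a grasshopper-style exploration move with a bat-style local walk can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: `toy_runtime` is a hypothetical stand-in for a measured job runtime over two tuning parameters normalized to [0, 1], the way the two moves are combined (keep whichever candidate scores better) is one plausible choice among many, and the grasshopper update omits the distance normalization used in the original algorithm.

```python
import math
import random


def toy_runtime(x):
    """Hypothetical runtime surface: nonlinear, with local minima; global minimum 0 at 0.3."""
    return sum((v - 0.3) ** 2 + 0.05 * (1 - math.cos(6 * math.pi * (v - 0.3))) for v in x)


def social_force(r, f=0.5, l=1.5):
    """Grasshopper attraction/repulsion strength at distance r."""
    return f * math.exp(-r / l) - math.exp(-r)


def clip(v):
    """Keep a normalized parameter inside [0, 1]."""
    return max(0.0, min(1.0, v))


def hybrid_search(objective, dim=2, pop=20, iters=100, seed=0):
    rng = random.Random(seed)
    swarm = [[rng.random() for _ in range(dim)] for _ in range(pop)]
    best = min(swarm, key=objective)[:]
    best_cost = objective(best)
    history = [best_cost]
    for t in range(iters):
        c = 1.0 - t * (1.0 - 1e-4) / iters  # shrinking coefficient: less exploration over time
        loudness = 0.25 * 0.97 ** t         # bat step size, decaying over time
        snapshot = [x[:] for x in swarm]
        for i, xi in enumerate(snapshot):
            # Exploration: grasshopper-style move driven by interaction with the swarm.
            move = [0.0] * dim
            for j, xj in enumerate(snapshot):
                if i == j:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj))) + 1e-12
                for d in range(dim):
                    gap = xj[d] - xi[d]
                    move[d] += c * 0.5 * social_force(abs(gap)) * gap / dist
            cand = [clip(c * move[d] + best[d]) for d in range(dim)]
            # Exploitation: bat-style random walk around the best solution so far.
            walk = [clip(best[d] + loudness * rng.uniform(-1, 1)) for d in range(dim)]
            if objective(walk) < objective(cand):
                cand = walk
            swarm[i] = cand
            cand_cost = objective(cand)
            if cand_cost < best_cost:
                best, best_cost = cand[:], cand_cost
        history.append(best_cost)
    return best, best_cost, history


if __name__ == "__main__":
    best, cost, _ = hybrid_search(toy_runtime)
    print("best parameters:", best, "cost:", cost)
```

In a real tuning setup the objective would be replaced by a function that runs or models a MapReduce job under a candidate configuration and returns its runtime, which is far more expensive to evaluate than this toy surface.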

Published

01.04.2024

How to Cite

Yambem, N., & Nandakumar, A. N. (2024). Enhanced Performance of Hadoop Parameters Using Hybrid Meta Heuristics Optimization Techniques. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 1508–1513. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5544

Section

Research Article