Enhanced Performance of Hadoop Parameters Using Hybrid Meta Heuristics Optimization Techniques
Keywords:
Grasshopper swarm optimization (GSO), hybrid meta-heuristics, bat algorithm, MapReduce task
Abstract
With Hadoop becoming the most popular open-source big data processing platform, various approaches have been proposed to maximize the performance of big data applications. However, the influence of the performance tuning parameters on overall application speedup is nonlinear and also depends on the characteristics of the application and its data. This work models the problem of finding optimal values for the tuning parameters as a search optimization problem and proposes a hybrid meta-heuristic solution that combines grasshopper swarm optimization with the bat algorithm. The hybrid algorithm has good exploration and exploitation ability, so that the optimal solution is found without getting trapped in local minima.
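The search formulation in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given here. It assumes each tuning parameter is normalized to [0, 1], replaces a measured job runtime with a hypothetical toy cost function (`toy_runtime`), and pairs a simplified grasshopper-style swarm move (exploration) with a bat-style random walk around the current best (exploitation). All function names and constants are illustrative assumptions.

```python
import math
import random


def toy_runtime(x):
    """Hypothetical stand-in for a measured job runtime (NOT a Hadoop model)."""
    target = [0.3, 0.7, 0.5]  # assumed optimum of the toy surface
    bowl = sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    ripple = 0.05 * sum(math.sin(8 * math.pi * xi) ** 2 for xi in x)
    return bowl + ripple  # the ripple adds local minima


def s_force(r, f=0.5, l=1.5):
    """Grasshopper social force: attraction minus repulsion at distance r."""
    return f * math.exp(-r / l) - math.exp(-r)


def clip(v, lo, hi):
    return max(lo, min(hi, v))


def hybrid_search(cost, dim, lo=0.0, hi=1.0, agents=12, iters=60, seed=0):
    """Simplified hybrid: swarm move for exploration, local walk for exploitation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(agents)]
    fit = [cost(p) for p in pop]
    k = min(range(agents), key=fit.__getitem__)
    best, best_fit = pop[k][:], fit[k]
    loudness = 0.9  # bat-style step scale, decays over time (assumed value)

    for t in range(iters):
        # Coefficient shrinks linearly so the swarm contracts toward the best.
        c = 1.0 - t * (1.0 - 1e-4) / iters
        for i in range(agents):
            # Grasshopper-style move: sum of pairwise social forces.
            move = [0.0] * dim
            for j in range(agents):
                if j == i:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                ) + 1e-12
                for d in range(dim):
                    gap = pop[j][d] - pop[i][d]
                    move[d] += c * (hi - lo) / 2 * s_force(abs(gap)) * gap / dist
            swarm = [clip(c * move[d] + best[d], lo, hi) for d in range(dim)]

            # Bat-style local walk around the best solution found so far.
            local = [
                clip(best[d] + 0.1 * loudness * rng.uniform(-1, 1) * (hi - lo), lo, hi)
                for d in range(dim)
            ]

            # Greedy acceptance: keep a candidate only if it improves the agent.
            for cand in (swarm, local):
                val = cost(cand)
                if val < fit[i]:
                    pop[i], fit[i] = cand, val
            if fit[i] < best_fit:
                best, best_fit = pop[i][:], fit[i]
        loudness *= 0.97

    return best, best_fit


if __name__ == "__main__":
    params, runtime = hybrid_search(toy_runtime, dim=3, seed=1)
    print("best normalized parameters:", [round(p, 3) for p in params])
    print("toy cost:", round(runtime, 5))
```

In a real tuning loop, the cost function would run or simulate a MapReduce job with the candidate parameter values and return its measured execution time; the toy function above only mimics a nonlinear surface with several local minima.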
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.