Designing a Framework for Improving Container Scheduling and Load Balancing Using Deep Reinforcement Learning

Authors

  • Ravikanth M., Sampath Korra, Raviteja Kocherla, Shaik Hussain Shaik Ibrahim, Sreekanth Kottu

Keywords:

Duplicate detection, Record linkage, Data deduplication, Data integration, Record matching, Rough sets.

Abstract

Record duplication occurs when the same record appears in a database more than once. Record matching determines which records correspond to the same real-world object, and it is currently one of the largest and most difficult tasks in this area. Duplicates needlessly lengthen query execution time and consume additional resources, among other costs. Record deduplication is performed to prevent these problems and their effects. This study proposes an efficient algorithm that matches records using a rough set approach. Datasets are processed with the proposed record deduplication method, which ultimately produces a dataset made up of distinct record entries. Two algorithms are proposed: Compute Distinguishability Matrix (CDM) and Rough Sets based Record Deduplication (RSRD). The former computes a matrix that the latter uses for record deduplication. Finally, experimental results obtained on common datasets are presented and the effectiveness of the approach is examined.
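The abstract names the two algorithms but does not give their steps, so the following is only a minimal sketch of the general idea, not the authors' CDM or RSRD. It assumes the matrix stores, for each pair of records, the set of attributes on which they differ (the usual rough set discernibility matrix), and that two records are duplicates when that set is empty. The function names `compute_distinguishability_matrix` and `deduplicate`, and the sample records, are hypothetical.

```python
from itertools import combinations


def compute_distinguishability_matrix(records, attributes):
    """Map each record pair (i, j), i < j, to the attributes on which they differ.

    Assumption: this mirrors a rough set discernibility matrix; the paper's
    CDM algorithm may be defined differently.
    """
    matrix = {}
    for i, j in combinations(range(len(records)), 2):
        matrix[(i, j)] = {
            a for a in attributes if records[i][a] != records[j][a]
        }
    return matrix


def deduplicate(records, attributes):
    """Keep the first record of every group of indiscernible records.

    Assumption: a pair with an empty difference set is treated as a duplicate.
    """
    matrix = compute_distinguishability_matrix(records, attributes)
    duplicates = set()
    for (i, j), differing in matrix.items():
        if not differing:
            # j > i, so the earlier record of the pair is the one retained.
            duplicates.add(j)
    return [r for k, r in enumerate(records) if k not in duplicates]


# Hypothetical sample data, for illustration only.
records = [
    {"name": "Ann Lee", "city": "Paris"},
    {"name": "Bob Ray", "city": "Rome"},
    {"name": "Ann Lee", "city": "Paris"},
]
attributes = ["name", "city"]
matrix = compute_distinguishability_matrix(records, attributes)
distinct = deduplicate(records, attributes)
```

Comparing every pair is quadratic in the number of records. A practical system would normally restrict comparisons to candidate pairs and use approximate rather than exact attribute matching; both are left out of this sketch.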

Published

24.03.2024

How to Cite

Sampath Korra, Raviteja Kocherla, Shaik Hussain Shaik Ibrahim, Sreekanth Kottu, R. M. (2024). Designing a Framework for Improving Container Scheduling and Load Balancing Using Deep Reinforcement Learning. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 1928–1934. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5657

Section

Research Article