Reinforcement mSVM: An Efficient Clustering and Classification Approach using reinforcement and supervised Techniques

Authors

  • Satish S. Banait, Department of Computer Engineering, KKWIEER, Nashik, SPPU Pune, India
  • S. S. Sane, Department of Computer Engineering, KKWIEER, Nashik, SPPU Pune, India
  • Dipak Bage, Department of Computer Engineering, KKWIEER, Nashik, SPPU Pune, India
  • A.R. Ugale, Department of Computer Engineering, MET’s Institute of Engineering, Nashik, SPPU, India

Keywords:

Big data mining, efficient clustering methodology, Unsupervised Learning Technique, Data mining, k-means

Abstract

Data mining and big data analytics are approaches for analysing data and extracting useful hidden information. Because big data is complex and large in volume, conventional methods of interpretation and retrieval do not work well on it. Data clustering is a common data mining approach that divides data points into groups and makes it possible to extract features from those groups. Conventional clustering techniques, such as k-means and hierarchical clustering, are inefficient on such data, and the reliability of the clusters they generate suffers. An efficient and relatively extensible clustering technique is therefore required. In this paper, we propose a novel similarity-based clustering technique for large unstructured transaction datasets. HDFS log data is collected from real-time virtual machines (VMs), and clusters are generated using a reinforcement learning technique. First, data is collected from various VMs and the proposed dimensionality reduction technique is applied to reduce it. A Q-learning-based reinforcement learning algorithm is then applied to the generated events. An activation function calculates the current weight for each transaction according to reward and penalty. Finally, a threshold-based entropy function generates the final clusters. After clustering, a modified support vector machine (mSVM) is applied as a supervised classifier to the entire labelled dataset. In an extensive experimental analysis, we compare the proposed model's performance with that of existing techniques. In tests on real, synthetic, and automatically generated datasets, the proposed clustering and classification method outperforms comparable models in average running time and average clustering error.
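
The abstract does not give the update rule or the entropy function, so the following is only a minimal sketch of the two generic building blocks it names: the standard tabular Q-learning update driven by a reward or penalty, and a Shannon-entropy threshold test on a candidate cluster. The function names, the reward values (+1 and -1), the learning rate `alpha`, the discount `gamma`, the `max_entropy` threshold, and the event and cluster labels are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new Q-value."""
    # Best value reachable from the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def shannon_entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)


def is_final_cluster(counts, max_entropy=0.5):
    """Assumed acceptance rule: keep a candidate cluster if its entropy is low."""
    return shannon_entropy(counts) <= max_entropy


# Hypothetical toy run: states are events, actions are candidate clusters.
actions = ["cluster_a", "cluster_b"]
q = defaultdict(float)
# Assumed reward of +1 for a good assignment, penalty of -1 for a poor one.
w_reward = q_update(q, "event_1", "cluster_a", 1.0, "event_2", actions)
w_penalty = q_update(q, "event_1", "cluster_b", -1.0, "event_2", actions)
```

In this toy run the rewarded assignment ends with a positive value and the penalised one with a negative value, which is the sense in which a per-transaction weight can follow reward and penalty. How the paper maps transactions to states, actions, and weights is not stated in the abstract.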

Proposed system architecture for dimensionality reduction and classification using machine learning

Published

15.10.2022

How to Cite

[1]
S. S. Banait, S. S. Sane, D. Bage, and A. Ugale, “Reinforcement mSVM: An Efficient Clustering and Classification Approach using reinforcement and supervised Techniques”, Int J Intell Syst Appl Eng, vol. 10, no. 1s, pp. 78–89, Oct. 2022.