A New Gaussian Kernel FCM Technique for High-Dimensional Information in Real Datasets

Authors

  • Rakesh Kumar Godi, Department of Computer Science and Engineering, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Manipal, India
  • Mule Shrishail Basvant, Associate Professor, Department of Electronics & Telecommunication Engineering, Sinhgad College of Engineering, Pune-41
  • Ramesh Babu Pittala, Associate Professor, Department of CSE, Koneru Lakshmaiah Education Foundation, Vaddeswaram (KL Deemed to be University), Guntur–522302, A.P., India
  • A. Deepak, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamilnadu
  • Arun Pratap Srivastava, Lloyd Institute of Engineering & Technology, Greater Noida
  • Akhil Sankhyan, Lloyd Law College, Greater Noida
  • Anurag Shrivastava, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamilnadu

Keywords:

Fuzzy c-means method, Gaussian kernel FCM (FCM-GM), Clustering

Abstract

Fuzzy C-Means (FCM) is the best-known fuzzy clustering algorithm, an unsupervised partitioning method used in applications such as pattern recognition, machine learning, and data mining. Although FCM detects clusters well, the membership value it computes for each individual in each cluster does not indicate how well that individual is clustered with respect to each variable. A multivariate variant of FCM for multidimensional data has been developed to address this problem. That variant assumes that the variables are uncorrelated and that the data, as well as their weighted counterparts, are linearly separable. This assumption ignores the fact that each variable has its own relevance weight, which may differ from one cluster to the next. In this paper, we present two weighted multivariate FCM algorithms for multidimensional data. The weights express the relative importance of each variable for each cluster and are intended to improve clustering quality. Experiments on synthetic and real data sets indicate that the proposed approach produces good clustering quality.
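
For orientation, the following is a minimal sketch of a standard Gaussian-kernel FCM iteration, with prototypes kept in the input space. It is not the weighted multivariate algorithm proposed in the paper, whose update rules are not given on this page. The function names (`gaussian_kernel`, `memberships_for`, `kernel_fcm`), the kernel width `sigma`, the fuzzifier `m`, and the optional `init` argument are all assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, v, sigma):
    """K(x, v) = exp(-||x - v||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)


def memberships_for(points, centers, m, sigma, eps=1e-12):
    """u_ki proportional to (1 - K(x_k, v_i))^(-1/(m-1)); each row sums to 1.

    eps guards against division by zero when a point coincides with a center.
    """
    rows = []
    for x in points:
        inv = [(1.0 - gaussian_kernel(x, v, sigma) + eps) ** (-1.0 / (m - 1.0))
               for v in centers]
        total = sum(inv)
        rows.append([w / total for w in inv])
    return rows


def kernel_fcm(points, n_clusters, m=2.0, sigma=1.0, init=None,
               n_iter=100, tol=1e-6, seed=0):
    """Gaussian-kernel fuzzy c-means (illustrative sketch, requires m > 1).

    Returns (centers, memberships); memberships[k][i] is the degree to which
    point k belongs to cluster i.
    """
    if init is None:
        # Assumed initialisation: pick distinct data points at random.
        init = random.Random(seed).sample(points, n_clusters)
    centers = [list(c) for c in init]
    for _ in range(n_iter):
        u = memberships_for(points, centers, m, sigma)
        new_centers = []
        for i, v in enumerate(centers):
            # Each point is weighted by u^m times its kernel similarity to v.
            weights = [(row[i] ** m) * gaussian_kernel(x, v, sigma)
                       for x, row in zip(points, u)]
            total = sum(weights)
            new_centers.append([
                sum(w * x[d] for w, x in zip(weights, points)) / total
                for d in range(len(v))
            ])
        shift = max(abs(a - b) for v, nv in zip(centers, new_centers)
                    for a, b in zip(v, nv))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships_for(points, centers, m, sigma)
```

Calling `kernel_fcm(points, 2, sigma=2.0)` on a list of coordinate tuples returns the cluster centers and one membership row per point. Because the kernel down-weights distant points, the result is sensitive to initialisation and to `sigma`; a per-variable, per-cluster weight of the kind described in the abstract would enter the squared distance inside `gaussian_kernel`, and is deliberately left out here.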

Published

24.03.2024

How to Cite

Godi, R. K., Basvant, M. S., Pittala, R. B., Deepak, A., Srivastava, A. P., Sankhyan, A., & Shrivastava, A. (2024). A New Gaussian Kernel FCM Technique for High-Dimensional Information in Real Datasets. International Journal of Intelligent Systems and Applications in Engineering, 12(18s), 65–75. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4952

Section

Research Article
