A New Approach to Determine Eps Parameter of DBSCAN Algorithm

Authors

Fatma Ozge Ozkok, Mete Celik

DOI:

https://doi.org/10.18201/ijisae.2017533899

Keywords:

AE-DBSCAN, Clustering, Data Mining, Density-Based Clustering

Abstract

In recent years, data analysis has become increasingly important as data volumes grow. Clustering, which groups objects according to their similarity, plays an important role in data analysis. DBSCAN is one of the most effective and popular density-based clustering algorithms and has been applied successfully in many areas. However, determining the values of its input parameters, the neighborhood radius (Eps) and the minimum number of points (MinPts), is challenging, and these values significantly affect the clustering performance of the algorithm. In this study, we propose the AE-DBSCAN algorithm, which includes a new method for determining the value of the neighborhood radius Eps automatically. Experimental evaluations showed that the proposed method outperformed the analytical DBSCAN.
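
To make the sensitivity to Eps concrete, the following is a minimal sketch of plain DBSCAN in pure Python, run on a small made-up data set with three different Eps values. It is not the AE-DBSCAN method, whose details are not given on this page. The point set, the helper names (`region_query`, `dbscan`, `summarize`, `k_distances`) and the choice MinPts = 3 are all assumptions made for illustration. The `k_distances` helper shows the sorted k-th nearest neighbor distances that are commonly inspected when choosing Eps by hand; treating that as the baseline the paper improves on would also be an assumption.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                seeds.extend(j_neighbors)
    return labels


def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    clusters = len({label for label in labels if label != -1})
    return clusters, labels.count(-1)


def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


# Hypothetical data: two tight groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]

for eps in (0.5, 1.5, 8.0):
    print(eps, summarize(dbscan(points, eps, min_pts=3)))
# 0.5 -> (0, 9): Eps too small, every point is noise
# 1.5 -> (2, 1): two clusters, the isolated point is noise
# 8.0 -> (1, 0): Eps too large, everything merges into one cluster

print(k_distances(points, k=2))
```

On this toy data the sorted 2-distances are 1.0 for the eight grouped points and about 6.4 for the isolated one, so any Eps between those two levels separates the groups from the outlier. On real data the gap is rarely this clean, which is why choosing Eps is difficult.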

Author Biographies

Fatma Ozge Ozkok, Erciyes University

Department of Computer Engineering

Mete Celik, Erciyes University

Department of Computer Engineering

Published

12.12.2017

How to Cite

Ozkok, F. O., & Celik, M. (2017). A New Approach to Determine Eps Parameter of DBSCAN Algorithm. International Journal of Intelligent Systems and Applications in Engineering, 5(4), 247–251. https://doi.org/10.18201/ijisae.2017533899

Section

Research Article