×

GAPS: A clustering method using a new point symmetry-based distance measure. (English) Zbl 1122.68486

Summary: An evolutionary clustering technique is described that uses a new point symmetry-based distance measure. The algorithm is therefore able to detect both convex and non-convex clusters. Kd-tree based nearest neighbor search is used to reduce the complexity of finding the closest symmetric point. Adaptive mutation and crossover probabilities are used. The proposed GA with Point Symmetry (GAPS) distance based clustering algorithm is able to detect any type of clusters, irrespective of their geometrical shape and overlapping nature, as long as they possess the characteristic of symmetry. GAPS is compared with existing symmetry-based clustering technique SBKM, its modified version, and the well-known K-means algorithm. Sixteen data sets with widely varying characteristics are used to demonstrate its superiority. For real-life data sets, ANOVA and MANOVA statistical analyses are performed.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68W05 Nonnumerical algorithms
68T10 Pattern recognition, speech recognition

Software:

J-MEANS; GAPS; ANN
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] A.K. Jain, M. Murthy, P. Flynn, Data clustering: a review, ACM Comput. Rev. November (1999).; A.K. Jain, M. Murthy, P. Flynn, Data clustering: a review, ACM Comput. Rev. November (1999).
[2] Berg, M. D.; Kreveld, M. V.; Overmars, M.; Schwarzkopf, O., Cluster Analysis for Application (1973), Academic Press: Academic Press New York
[3] Duda, R. O.; Hart, P. E., Pattern Classification and Scene Analysis (1973), Wiley: Wiley New York · Zbl 0277.68056
[4] Tou, J. T.; Gonzalez, R. C., Pattern Recognition Principles (1974), Addison-Wesley: Addison-Wesley Reading, MA · Zbl 0299.68058
[5] Everitt, B. S.; Landau, S.; Leese, M., Cluster Analysis (2001), Arnold: Arnold London · Zbl 1205.62076
[6] Jain, A. K.; Dubes, R. C., Algorithms for Clustering Data (1988), Prentice-Hall: Prentice-Hall Englewood Cliffs, NJ · Zbl 0665.62061
[7] Ko’´vesi, B.; Boucher, J.; Saoodi, S., Stochastic k-means algorithm for vector quantization, Pattern Recognition Lett., 22, 603-610 (2001) · Zbl 1010.68906
[8] Anderberg, M. R., Computational Geometry: Algorithms and Applications (2000), Springer: Springer Berlin
[9] Kanungo, T.; Mount, D.; Netanyahu, N. S.; Piatko, C.; Silverman, R.; Wu, A., An efficient k-means clustering algorithm: analysis and implementation, IEEE Trans. Pattern Anal. Mach. Intell., 24, 7, 881-892 (2002)
[10] Likas, A.; Vlassis, N.; Verbeek, J. J., The global k-means clustering algorithm, Pattern Recognition, 36, 451-461 (2003)
[11] Charalampidis, D., A modified k-means algorithm for circular invariant clustering, IEEE Trans. Pattern Anal. Mach. Intell., 27, 12, 1856-1865 (2005)
[12] Dave, R. N.; Bhaswan, K., Adaptive fuzzy c-shells clustering and detection of ellipses, IEEE Trans. Neural Networks, 3, 5, 643-662 (1992)
[13] Krishnapuram, R.; Nasraoui, O.; Frigui, H., The fuzzy c-spherical shells algorithm: A new approach, IEEE Trans. Neural Networks, 3, 5, 663-671 (1992)
[14] E. Hartuv, R. Shamir, A clustering algorithm based on graph connectivity, Inform. Process. Lett. (2000) 75-181.; E. Hartuv, R. Shamir, A clustering algorithm based on graph connectivity, Inform. Process. Lett. (2000) 75-181. · Zbl 0996.68525
[15] Hansen, P.; Mladenovic, N., J-means: a new local search heuristic for minimum sum of squares clustering, Pattern Recognition, 34, 405-413 (2001) · Zbl 1012.68873
[16] Wong, C.; Chen, C.; Su, M., A novel algorithm for data clustering, Pattern Recognition, 34, 425-442 (2001) · Zbl 1012.68870
[17] P. Bradley, U. Fayyad, C. Reina, Scaling clustering algorithms to large databases, in: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, August (1998) 9-15.; P. Bradley, U. Fayyad, C. Reina, Scaling clustering algorithms to large databases, in: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, August (1998) 9-15.
[18] Zhang, T.; Ramakrishnan, R.; Livny, M., BIRCH: an efficient data clustering method for very large databases, (Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data (1996)), 103-114
[19] Yan, H., Fuzzy curve-tracing algorithm, IEEE Trans. Syst. Man Cybern. Part B, 31, 5, 768-780 (2001)
[20] Mikheev, A.; Vincent, L.; Faber, V., High-quality polygonal contour approximation based on relaxation, (Proceedings of Sixth International Conference on Document Analysis and Recognition (ICDAR’01) (2001), IEEE Computer Society: IEEE Computer Society Los Alamitos, CA, USA), 0361
[21] Yip, A. M.; Ding, C.; Chan, T. F., Dynamic cluster formation using level set methods, IEEE Trans. Pattern Anal. Mach. Intell., 28, 6, 877-889 (2006)
[22] Nock, R.; Nielsen, F., On weighting clustering, IEEE Trans. Pattern Anal. Mach. Intell., 28, 8, 1223-1235 (2006)
[23] Chen, Y.-L.; Hu, H.-L., An overlapping cluster algorithm to provide non-exhaustive clustering, Eur. J. Oper. Res., 173, 762-780 (2006) · Zbl 1120.62048
[24] J.C. Bezdek, Fuzzy mathematics in pattern classification, Ph.D. Thesis, 1973.; J.C. Bezdek, Fuzzy mathematics in pattern classification, Ph.D. Thesis, 1973.
[25] Beliakov, G.; King, M., Density based fuzzy c-means clustering of non-convex patterns, Eur. J. Oper. Res., 173, 717-728 (2006) · Zbl 1125.90048
[26] Kohonen, T., The ‘neural’ phonetic typewriter, IEEE Computer, 27, 3, 11-12 (1988)
[27] Kohonen, T., Self-organization and associative memory (1989), Springer: Springer New York, Berlin · Zbl 0528.68062
[28] Carpenter, G.; Grossberg, S., A massively parallel architecture for a self-organizing neural pattern recognition machine, Comput. Vision, Graphics, Image Proc., 37, 3, 54-115 (1987) · Zbl 0634.68089
[29] Carpenter, G.; Grossberg, S., ART2: self-organization of stable category recognition codes for analog input patterns, Appl. Opt., 26, 23, 4919-4930 (1987)
[30] Hertz, J.; Krogh, A.; Palmer, R. G., Introduction to the Theory of Neural Computation (1991), Addison-Wesley: Addison-Wesley Reading, MA
[31] Mao, J.; Jain, A. K., A self-organizing network for hyperellipsoidal clustering, IEEE Trans. Neural Networks, 7, 16-29 (1996)
[32] Su, M.-C.; DeClaris, N.; Kang Liu, T., Application of neural networks in cluster analysis, (Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Computational Cybernetics and Simulation, vol. 1 (1997), Orlando, FL: Orlando, FL USA), 1-6
[33] Su, M.-C.; Liu, Y.-C., A new approach to clustering data with arbitrary shapes, Pattern Recognition, 38, 1887-1901 (2005) · Zbl 1077.68809
[34] Gath, I.; Geva, A., Unsupervised optimal fuzzy clustering, IEEE Trans. Pattern Anal. Mach. Intell., 11, 773-781 (1989)
[35] D. Gustafson, W. Kessel, Fuzzy clustering with a fuzzy covariance matrix, Proceedings of the IEEE Conference Decision Control, 1979, pp. 761-766.; D. Gustafson, W. Kessel, Fuzzy clustering with a fuzzy covariance matrix, Proceedings of the IEEE Conference Decision Control, 1979, pp. 761-766. · Zbl 0448.62045
[36] Dave, R. N., Use of the adaptive fuzzy clustering algorithm to detect lines in digital images, Intell. Robots Comput. Vision VIII, 1192, 600-611 (1989)
[37] Man, Y.; Gath, I., Detection and separation of ring-shaped clusters using fuzzy clustering, IEEE Trans. Pattern Anal. Mach. Intell., 16, 8, 855-861 (1994)
[38] Höppner, F., Fuzzy shell clustering algorithms in image processing: fuzzy c-rectangular and 2-rectangular shells, IEEE Trans. Fuzzy syst., 5, 599-613 (1997)
[39] Bandyopadhyay, S., An automatic shape independent clustering technique, Pattern Recognition, 37, 33-45 (2004)
[40] Attneave, F., Symmetry information and memory for pattern, Am. J. Psychol., 68, 209-222 (1995)
[41] Su, M.-C.; Chou, C.-H., A modified version of the k-means algorithm with a distance based on cluster symmetry, IEEE Trans. Pattern Anal. Mach. Intell., 23, 6, 674-680 (2001)
[42] Chou, C. H.; Su, M. C.; Lai, E., Symmetry as a new measure for cluster validity, (Second WSEAS International Conference on Scientific Computation and Soft Computing (2002)), 209-213
[43] Maulik, U.; Bandyopadhyay, S., Genetic algorithm based clustering technique, Pattern Recognition, 33, 1455-1465 (2000)
[44] Holland, J. H., Adaptation in Natural and Artificial Systems (1975), The University of Michigan Press: The University of Michigan Press AnnArbor
[45] Anderson, T. W.; Scolve, S., Introduction to the Statistical Analysis of Data (1978), Houghton Mifflin: Houghton Mifflin Boston, MA
[46] Anderson, T. W., An Introduction to Multivariate Statistical Analysis (1984), Wiley: Wiley New York · Zbl 0651.62041
[47] D.M. Mount, S. Arya, ANN: a library for approximate nearest neighbor searching, \(2005. \langle;\) http://www.cs.umd.edu/\( \sim;\rangle;\); D.M. Mount, S. Arya, ANN: a library for approximate nearest neighbor searching, \(2005. \langle;\) http://www.cs.umd.edu/\( \sim;\rangle;\)
[48] Srinivas, M.; Patnaik, L., Adaptive probabilities of crossover and mutation in genetic algorithms, IEEE Trans. Syst. Man Cybern., 24, 4, 656-667 (1994)
[49] Bentley, J. L.; Weide, B. W.; Yao, A. C., Optimal expected-time algorithms for closest point problems, ACM Trans. Math. Software, 6, 4, 563-580 (1980) · Zbl 0441.68077
[50] Friedman, J. H.; Bently, J. L.; Finkel, R. A., n algorithm for finding best matches in logarithmic expected time, ACM Trans. Math. Software, 3, 3, 209-226 (1977) · Zbl 0364.68037
[51] Bandyopadhyay, S.; Maulik, U., Genetic clustering for automatic evolution of clusters and application to image classification, Pattern Recognition, 2, 1197-1208 (2002) · Zbl 1001.68926
[52] Bensaid, A. M.; Hall, L. O.; Bezdek, J. C.; Clarke, L. P.; Silbiger, M. L.; Arrington, J. A.; Murtagh, R. F., Validity-guided reclustering with applications to image segmentation, IEEE Trans. Fuzzy Syst., 4, 2, 112-123 (1996)
[53] http://www.ics.uci.edu/\( \sim;\); http://www.ics.uci.edu/\( \sim;\)
[54] Ben-Hur, A.; Guyon, I., Detecting Stable Clusters using Principal Component Analysis in Methods in Molecular Biology (2003), Humana press: Humana press Clifton, UK
[55] Fisher, R. A., The use of multiple measurements in taxonomic problems, Ann. Eugen., 3, 179-188 (1936)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.