
Self-organizing map for symbolic data. (English) Zbl 1254.68234

Summary: Kohonen’s self-organizing map (SOM) is a competitive learning neural network that uses a neighborhood lateral interaction function to discover the topological structure hidden in a data set. It is an unsupervised learning method with both visualization and clustering properties. In general, the SOM is constructed as a learning algorithm for numeric (vector) data. Although the literature contains various SOM clustering methods for numeric data with real applications, SOM clustering for symbolic data has received less attention. In this paper, we modify the SOM so that it can treat symbolic data, and propose a so-called symbolic SOM (S-SOM). We first use novel structures to represent symbolic neurons. We then use a suppression concept to create a learning rule for the neurons. The S-SOM is thus obtained by embedding the novel structure and the suppression learning rule. The S-SOM is applied to several real data sets. The experimental results show the feasibility and effectiveness of the proposed S-SOM in these real applications.
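For orientation, the classical numeric SOM that the summary takes as its starting point can be sketched as follows. This is a minimal illustration only, not the S-SOM of the paper: the symbolic neuron structures and the suppression learning rule are not specified in this record and are not reproduced here. The function names `train_som` and `best_matching_unit`, the Gaussian neighborhood, and the linear decay of the learning rate and neighborhood width are all assumptions chosen for the sketch.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Online Kohonen SOM for numeric vectors (illustrative, not the paper's S-SOM)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood width
    # One weight vector per grid node, initialised from randomly chosen data points.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # floor keeps sigma positive
            bmu = best_matching_unit(weights, x)     # competitive step
            for node, w in weights.items():
                # Neighborhood (lateral interaction) is measured on the grid,
                # not in the data space.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

Calling `train_som(data, rows=1, cols=2)` on two well-separated groups of points would be expected to place one node near each group, which is the clustering behaviour the summary refers to. Handling symbolic data would require replacing both the distance in `best_matching_unit` and the additive update, which is the gap the paper addresses.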

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml; SODAS
