Feature-weighted clustering with inner product induced norm based dissimilarity measures: an optimization perspective. (English) Zbl 1460.62098

Summary: The performance of a clustering algorithm can be improved by assigning appropriate weights to the different features of the data. Such feature weighting is likely to reduce the effect of noise and irrelevant features while simultaneously enhancing the effect of the discriminative features. For clustering purposes, feature-weighted dissimilarity measures have so far been limited to the Euclidean, Mahalanobis, and exponential distances. In this article, we introduce a novel feature weighting scheme for the general class of inner product induced norm (IPIN) based weighted dissimilarity measures. This class is broad and includes the three aforementioned distances as special cases. We develop general algorithms to solve the hard (\(k\)-means) and fuzzy (fuzzy \(c\)-means) partitional clustering problems and undertake an in-depth convergence analysis of these algorithms. In addition, we address issues such as the feasibility and uniqueness of the solutions to these problems in sufficient detail. The novelty of the article lies in the introduction of a general feature weighting scheme for the generalized class of IPIN-based dissimilarity measures and a complete convergence analysis of the automated feature-weighted clustering algorithms that use such measures.
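
To make the alternating-optimization idea concrete, the following is a minimal NumPy sketch of hard feature-weighted clustering in the simplest diagonal special case of an IPIN-based measure, namely the weighted squared Euclidean distance \(\sum_j w_j^\beta (x_j - z_j)^2\). The weight update \(w_j \propto D_j^{-1/(\beta-1)}\) follows the style of Huang et al. [19] rather than the paper's general scheme; the function name and parameters are illustrative, not the authors' implementation.

import numpy as np

def feature_weighted_kmeans(X, k, beta=2.0, n_iter=50, seed=0):
    """Hard feature-weighted clustering by alternating optimization:
    assignments, then centroids, then feature weights. Uses the diagonal
    weighted squared Euclidean distance sum_j w_j**beta * (x_j - z_j)**2;
    beta > 1 is required for the closed-form weight update below."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    weights = np.full(p, 1.0 / p)  # start from uniform weights summing to 1

    for _ in range(n_iter):
        # 1) Assign each point to the nearest centroid under the weighted distance.
        d2 = ((weights ** beta) * (X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # 2) Update each non-empty cluster's centroid as its mean
        #    (the minimizer is unaffected by fixed feature weights).
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)

        # 3) Update feature weights: a feature with small within-cluster
        #    dispersion D_j is more discriminative and gets a larger weight.
        D = ((X - centroids[labels]) ** 2).sum(axis=0)
        inv = np.maximum(D, 1e-12) ** (-1.0 / (beta - 1.0))
        weights = inv / inv.sum()

    return labels, centroids, weights

# Example: two informative features forming three blobs, three noise features.
rng = np.random.default_rng(1)
blobs = rng.normal(0.0, 0.3, (90, 2)) + np.repeat([[0, 0], [4, 0], [0, 4]], 30, axis=0)
noise = rng.normal(0.0, 3.0, (90, 3))
labels, _, w = feature_weighted_kmeans(np.hstack([blobs, noise]), k=3)
print(w)  # the two informative features should receive the largest weights

In the paper's general setting, this diagonal weighting is replaced by dissimilarities induced by arbitrary inner products, of which the Mahalanobis and exponential distances are special cases, and an analogous alternating scheme applies with correspondingly generalized update steps.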

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H86 Multivariate analysis and fuzziness
68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml; OVWTRE

References:

[1] Anderberg, M. R. (2014). Cluster analysis for applications (Probability and Mathematical Statistics: A Series of Monographs and Textbooks, Vol. 19). London: Academic Press.
[2] Bandyopadhyay, S., & Maulik, U. (2002). Genetic clustering for automatic evolution of clusters and application to image classification. Pattern Recognition, 35(6), 1197-1208. · Zbl 1001.68926 · doi:10.1016/S0031-3203(01)00108-X
[3] Bandyopadhyay, S., & Pal, S. K. (2007). Classification and learning using genetic algorithms: Applications in bioinformatics and web intelligence. Berlin: Springer. · Zbl 1138.68486
[4] Banerjee, A., Merugu, S., Dhillon, I. S., & Ghosh, J. (2005). Clustering with Bregman divergences. The Journal of Machine Learning Research, 6, 1705-1749. · Zbl 1190.62117
[5] Berkhin, P. (2006). A survey of clustering data mining techniques. In J. Kogan, C. Nicholas, & M. Teboulle (Eds.), Grouping multidimensional data (pp. 25-71). Berlin: Springer.
[6] Bezdek, J. C. (1973). Cluster validity with fuzzy sets. Journal of Cybernetics, 3(3), 58-73. · Zbl 0294.68035 · doi:10.1080/01969727308546047
[7] Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithms. Norwell, MA: Kluwer. · Zbl 0503.68069 · doi:10.1007/978-1-4757-0450-1
[8] Bezdek, J. C., & Hathaway, R. J. (2003). Convergence of alternating optimization. Neural, Parallel & Scientific Computations, 11(4), 351-368. · Zbl 1063.90051
[9] Chaomurilige, Yu, J., & Yang, M.-S. (2015). Analysis of parameter selection for Gustafson-Kessel fuzzy clustering using Jacobian matrix. IEEE Transactions on Fuzzy Systems, 23(6), 2329-2342. · doi:10.1109/TFUZZ.2015.2421071
[10] Dave, R. N. (1996). Validating fuzzy partitions obtained through c-shells clustering. Pattern Recognition Letters, 17(6), 613-623. · doi:10.1016/0167-8655(96)00026-8
[11] De Soete, G. (1988). OVWTRE: A program for optimal variable weighting for ultrametric and additive tree fitting. Journal of Classification, 5(1), 101-104. · doi:10.1007/BF01901677
[12] DeSarbo, W. S., Carroll, J. D., Clark, L. A., & Green, P. E. (1984). Synthesized clustering: A method for amalgamating alternative clustering bases with differential weighting of variables. Psychometrika, 49(1), 57-78. · Zbl 0594.62067 · doi:10.1007/BF02294206
[13] Dunn, J. C. (1973). A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics, 3(3), 32-57. · Zbl 0291.68033 · doi:10.1080/01969727308546046
[14] D’Urso, P., Massari, R., De Giovanni, L., & Cappelli, C. (2016). Exponential distance-based fuzzy clustering for interval-valued data. Fuzzy Optimization and Decision Making. doi:10.1007/s10700-016-9238-8. · Zbl 1428.62306
[15] Fitzpatrick, P. (2006). Advanced calculus (Vol. 5). Providence: American Mathematical Society. · Zbl 1229.26001
[16] Gath, I., & Geva, A. B. (1989). Unsupervised optimal fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7), 773-780. · Zbl 0709.62592 · doi:10.1109/34.192473
[17] Golub, G. H., & Van Loan, C. F. (2012). Matrix computations (Vol. 3). Baltimore: JHU Press. · Zbl 0559.65011
[18] Gustafson, D., & Kessel, W. (1978). Fuzzy clustering with a fuzzy covariance matrix. In 1978 IEEE conference on decision and control including the 17th symposium on adaptive processes (No. 17, pp. 761-766).
[19] Huang, J. Z., Ng, M. K., Rong, H., & Li, Z. (2005). Automated variable weighting in k-means type clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(5), 657-668. · doi:10.1109/TPAMI.2005.95
[20] Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193-218. · Zbl 0587.62128 · doi:10.1007/BF01908075
[21] Hung, W.-L., Yang, M.-S., & Hwang, C.-M. (2011). Exponential-distance weighted k-means algorithm with spatial constraints for color image segmentation. In 2011 international conference on multimedia and signal processing (CMSP) (Vol. 1, pp. 131-135). IEEE.
[22] Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: A review. ACM Computing Surveys (CSUR), 31(3), 264-323. · doi:10.1145/331499.331504
[23] Keller, A., & Klawonn, F. (2000). Fuzzy clustering with weighting of data variables. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 8(06), 735-746. · Zbl 0986.68943 · doi:10.1142/S0218488500000538
[24] Klawonn, F., & Höppner, F. (2003). What is fuzzy about fuzzy clustering? Understanding and improving the concept of the fuzzifier. In International symposium on intelligent data analysis (pp. 254-264). Berlin: Springer.
[25] Krishnapuram, R., & Kim, J. (1999). A note on the Gustafson-Kessel and adaptive fuzzy clustering algorithms. IEEE Transactions on Fuzzy Systems, 7(4), 453-461. · doi:10.1109/91.784208
[26] Lichman, M. (2013). UCI machine learning repository. Irvine, CA: University of California, School of Information and Computer Science. http://archive.ics.uci.edu/ml.
[27] Liu, H.-C., Jeng, B.-C., Yih, J.-M., & Yu, Y.-K. (2009a). Fuzzy c-means algorithm based on standard Mahalanobis distances. In Proceedings of the 2009 international symposium on information processing (pp. 422-427).
[28] Liu, H.-C., Yih, J.-M., Lin, W.-C., & Wu, D.-B. (2009b). Fuzzy c-means algorithm based on common Mahalanobis distances. Journal of Multiple-Valued Logic & Soft Computing, 15, 581-595.
[29] Liu, H.-C., Yih, J.-M., & Liu, S.-W. (2007a). Fuzzy c-means algorithm based on Mahalanobis distances and better initial values. In Proceedings of the 10th joint conference and 12th international conference on fuzzy theory and technology (Vol. 1, pp. 1398-1404). Singapore: World Scientific.
[30] Liu, H.-C., Yih, J.-M., Sheu, T.-W., & Liu, S.-W. (2007b). A new fuzzy possibility clustering algorithm based on unsupervised Mahalanobis distances. In 2007 international conference on machine learning and cybernetics (Vol. 7, pp. 3939-3944). IEEE.
[31] Lloyd, S. P. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2), 129-137. · Zbl 0504.94015 · doi:10.1109/TIT.1982.1056489
[32] Lumelsky, V. J. (1982). A combined algorithm for weighting the variables and clustering in the clustering problem. Pattern Recognition, 15(2), 53-60. · Zbl 0473.62049 · doi:10.1016/0031-3203(82)90001-2
[33] MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (Vol. 1, pp. 281-297). Oakland, CA, USA. · Zbl 0214.46201
[34] Makarenkov, V., & Legendre, P. (2001). Optimal variable weighting for ultrametric and additive trees and k-means partitioning: Methods and software. Journal of Classification, 18(2), 245-271. · Zbl 1040.91087
[35] Mao, J., & Jain, A. K. (1996). A self-organizing network for hyperellipsoidal clustering (HEC). IEEE Transactions on Neural Networks, 7(1), 16-29. · doi:10.1109/72.478389
[36] Modha, D. S., & Spangler, W. S. (2003). Feature weighting in k-means clustering. Machine Learning, 52(3), 217-237. · Zbl 1039.68111 · doi:10.1023/A:1024016609528
[37] Munkres, J. R. (2000). Topology (2nd ed.). Upper Saddle River, NJ: Prentice Hall. · Zbl 0951.54001
[38] Nazari, M., Shanbehzadeh, J., & Sarrafzadeh, A. (2013). Fuzzy c-means based on automated variable feature weighting. In Proceedings of the international multiconference of engineers and computer scientists (Vol. 1, pp. 25-29).
[39] Olmsted, J. M. H. (1961). Advanced calculus. Upper Saddle River, NJ: Prentice Hall. · Zbl 0131.29401
[40] Ostrovsky, R., Rabani, Y., Schulman, L. J., & Swamy, C. (2012). The effectiveness of Lloyd-type methods for the k-means problem. Journal of the ACM (JACM), 59(6), 28. · Zbl 1281.68229 · doi:10.1145/2395116.2395117
[41] Saha, A., & Das, S. (2015a). Automated feature weighting in clustering with separable distances and inner product induced norms-A theoretical generalization. Pattern Recognition Letters, 63, 50-58. · doi:10.1016/j.patrec.2015.06.001
[42] Saha, A., & Das, S. (2015b). Categorical fuzzy k-modes clustering with automated feature weight learning. Neurocomputing, 166, 422-435. · doi:10.1016/j.neucom.2015.03.037
[43] Saha, A., & Das, S. (2016a). Geometric divergence based fuzzy clustering with strong resilience to noise features. Pattern Recognition Letters, 79, 60-67. · doi:10.1016/j.patrec.2016.04.013
[44] Saha, A., & Das, S. (2016b). Optimizing cluster structures with inner product induced norm based dissimilarity measures: Theoretical development and convergence analysis. Information Sciences, 372, 796-814. · Zbl 1428.62297 · doi:10.1016/j.ins.2016.08.058
[45] Selim, S. Z., & Ismail, M. A. (1984). K-means-type algorithms: A generalized convergence theorem and characterization of local optimality. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(1), 81-87. · Zbl 0546.62037 · doi:10.1109/TPAMI.1984.4767478
[46] Sneath, P. H. A., & Sokal, R. R. (1973). Numerical taxonomy: The principles and practice of numerical classification. San Francisco, CA: W. H. Freeman. · Zbl 0285.92001
[47] Teboulle, M. (2007). A unified continuous optimization framework for center-based clustering methods. The Journal of Machine Learning Research, 8, 65-102. · Zbl 1222.68318
[48] Teboulle, M., Berkhin, P., Dhillon, I., Guan, Y., & Kogan, J. (2006). Clustering with entropy-like k-means algorithms. In J. Kogan, C. Nicholas, & M. Teboulle (Eds.), Grouping multidimensional data (pp. 127-160). Berlin: Springer. · doi:10.1007/3-540-28349-8_5
[49] Wölfel, M., & Ekenel, H. K. (2005). Feature weighted Mahalanobis distance: Improved robustness for Gaussian classifiers. In 2005 13th European signal processing conference (pp. 1-4). IEEE.
[50] Wu, J., Xiong, H., Liu, C., & Chen, J. (2012). A generalization of distance functions for fuzzy c-means clustering with centroids of arithmetic means. IEEE Transactions on Fuzzy Systems, 20(3), 557-571. · doi:10.1109/TFUZZ.2011.2179659
[51] Xie, X. L., & Beni, G. (1991). A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(8), 841-847. · doi:10.1109/34.85677
[52] Yeung, K. Y., & Ruzzo, W. L. (2001). Details of the adjusted Rand index and clustering algorithms, supplement to the paper “An empirical study on principal component analysis for clustering gene expression data”. Bioinformatics, 17(9), 763-774.
[53] Zangwill, W. I. (1969). Nonlinear programming: A unified approach (Vol. 196). Englewood Cliffs, NJ: Prentice-Hall. · Zbl 0195.20804