
Exponential distance-based fuzzy clustering for interval-valued data. (English) Zbl 1428.62306

Summary: In many real-life and research situations, data are collected in the form of intervals, the so-called interval-valued data. In this paper a fuzzy clustering method for analysing interval-valued data is presented. In particular, we address the problem of interval-valued data corrupted by outliers and noise. To cope with the presence of outliers, we propose to employ a robust metric based on the exponential distance in the framework of the Fuzzy \(C\)-medoids clustering model, yielding the Fuzzy \(C\)-medoids clustering model for interval-valued data with exponential distance. The exponential distance assigns small weights to outliers and larger weights to points that lie in compact regions of the data set, thus neutralizing the effect of anomalous interval-valued data. Simulation results on the behaviour of the proposed approach, as well as two empirical applications, are provided to illustrate the practical usefulness of the proposed method.
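The down-weighting effect described in the summary can be illustrated with a minimal sketch. It is not taken from the paper: it assumes each interval is represented by its midpoint and radius, that the squared distance is the unweighted sum of squared midpoint and radius differences, and that the exponential distance has the common form \(1-\exp(-\beta d^2)\). The names `interval_sq_dist`, `exp_dist`, `outlier_weight` and the parameter `beta` are hypothetical.

```python
import numpy as np

def interval_sq_dist(a, b):
    """Squared distance between two interval-valued observations.

    a, b: array-likes of shape (p, 2) holding [lower, upper] per variable.
    Assumption: midpoints and radii are compared with equal weight.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mid_a, rad_a = (a[:, 0] + a[:, 1]) / 2, (a[:, 1] - a[:, 0]) / 2
    mid_b, rad_b = (b[:, 0] + b[:, 1]) / 2, (b[:, 1] - b[:, 0]) / 2
    return float(np.sum((mid_a - mid_b) ** 2) + np.sum((rad_a - rad_b) ** 2))

def outlier_weight(a, b, beta=1.0):
    """exp(-beta * d^2): close to 1 for nearby points, close to 0 for outliers."""
    return float(np.exp(-beta * interval_sq_dist(a, b)))

def exp_dist(a, b, beta=1.0):
    """Assumed exponential distance 1 - exp(-beta * d^2), bounded above by 1."""
    return 1.0 - outlier_weight(a, b, beta)

# One interval-valued variable: a medoid, a nearby point and an outlier.
medoid = [[0.0, 1.0]]
nearby = [[0.2, 1.2]]
outlier = [[10.0, 11.0]]

w_near = outlier_weight(nearby, medoid)
w_out = outlier_weight(outlier, medoid)
d_near = exp_dist(nearby, medoid)
d_out = exp_dist(outlier, medoid)
```

Because the assumed distance saturates at 1, an anomalous interval cannot dominate the clustering objective, whereas with a plain squared distance its contribution would grow without bound. In this sketch `beta` controls how quickly the weight decays with distance.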

MSC:

62H86 Multivariate analysis and fuzziness
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T10 Pattern recognition, speech recognition

References:

[1] Anderson, DT; Bezdek, JC; Popescu, M; Keller, JM, Comparing fuzzy, probabilistic, and possibilistic partitions, IEEE Transactions on Fuzzy Systems, 18, 906-918, (2010) · doi:10.1109/TFUZZ.2010.2052258
[2] Campello, RJ; Hruschka, ER, A fuzzy extension of the silhouette width criterion for cluster analysis, Fuzzy Sets and Systems, 157, 2858-2875, (2006) · Zbl 1103.68674 · doi:10.1016/j.fss.2006.07.006
[3] Cazes, P; Chouakria, A; Diday, E; Schektman, Y, Extension de l’analyse en composantes principales à des données de type intervalle, Revue de Statistique Appliquée, 45, 5-24, (1997)
[4] Coppi, R; D’Urso, P, Fuzzy k-means clustering models for triangular fuzzy time trajectories, Statistical Methods and Applications, 11, 21-40, (2002) · Zbl 1145.62347 · doi:10.1007/BF02511444
[5] Carvalho, FdAT; Lechevallier, Y, Partitional clustering algorithms for symbolic interval data based on single adaptive distances, Pattern Recognition, 42, 1223-1236, (2009) · Zbl 1183.68527 · doi:10.1016/j.patcog.2008.11.016
[6] Carvalho, FdAT; Tenório, CP, Fuzzy k-means clustering algorithms for interval-valued data based on adaptive quadratic distances, Fuzzy Sets and Systems, 161, 2978-2999, (2010) · Zbl 1204.62106 · doi:10.1016/j.fss.2010.08.003
[7] Carvalho, FdAT; Souza, RM; Chavent, M; Lechevallier, Y, Adaptive Hausdorff distances and dynamic clustering of symbolic interval data, Pattern Recognition Letters, 27, 167-179, (2006) · doi:10.1016/j.patrec.2005.08.014
[8] Denoeux, T; Masson, M, Multidimensional scaling of interval-valued dissimilarity data, Pattern Recognition Letters, 21, 83-92, (2000) · doi:10.1016/S0167-8655(99)00135-X
[9] Dey, V; Pratihar, DK; Datta, GL, Genetic algorithm-tuned entropy-based fuzzy c-means algorithm for obtaining distinct and compact clusters, Fuzzy Optimization and Decision Making, 10, 153-166, (2011) · doi:10.1007/s10700-011-9097-2
[10] Duarte Silva, AP; Brito, P, Discriminant analysis of interval data: an assessment of parametric and distance-based approaches, Journal of Classification, 32, 516-541, (2015) · Zbl 1331.62305 · doi:10.1007/s00357-015-9189-8
[11] D’Urso, P; Giovanni, L, Robust clustering of imprecise data, Chemometrics and Intelligent Laboratory Systems, 136, 58-80, (2014) · doi:10.1016/j.chemolab.2014.05.004
[12] D’Urso, P; Giordani, P, A least squares approach to principal component analysis for interval valued data, Chemometrics and Intelligent Laboratory Systems, 70, 179-192, (2004) · doi:10.1016/j.chemolab.2003.11.005
[13] D’Urso, P; Giordani, P, A robust fuzzy k-means clustering model for interval valued data, Computational Statistics, 21, 251-269, (2006) · Zbl 1113.62076 · doi:10.1007/s00180-006-0262-y
[14] D’Urso, P; Giovanni, L; Massari, R, Time series clustering by a robust autoregressive metric with application to air pollution, Chemometrics and Intelligent Laboratory Systems, 141, 107-124, (2015) · doi:10.1016/j.chemolab.2014.11.003
[15] D’Urso, P; Giovanni, L; Massari, R, Trimmed fuzzy clustering for interval-valued data, Advances in Data Analysis and Classification, 9, 21-40, (2015) · doi:10.1007/s11634-014-0169-3
[16] García-Escudero, LA; Gordaliza, A, A proposal for robust curve clustering, Journal of Classification, 22, 185-201, (2005) · Zbl 1336.62179 · doi:10.1007/s00357-005-0013-8
[17] Giordani, P; Kiers, HA, Three-way component analysis of interval-valued data, Journal of Chemometrics, 18, 253-264, (2004) · doi:10.1002/cem.868
[18] Gowda, KC; Diday, E, Symbolic clustering using a new dissimilarity measure, Pattern Recognition, 24, 567-578, (1991) · doi:10.1016/0031-3203(91)90022-W
[19] Guru, DS; Kiranagi, BB; Nagabhushan, P, Multivalued type proximity measure and concept of mutual similarity value useful for clustering symbolic patterns, Pattern Recognition Letters, 25, 1203-1213, (2004) · doi:10.1016/j.patrec.2004.03.016
[20] Hung, TW, The bi-objective fuzzy c-means cluster analysis for TSK fuzzy system identification, Fuzzy Optimization and Decision Making, 6, 51-61, (2007) · Zbl 1105.62067 · doi:10.1007/s10700-006-0024-x
[21] Kim, J; Krishnapuram, R; Davé, R, Application of the least trimmed squares technique to prototype-based clustering, Pattern Recognition Letters, 17, 633-641, (1996) · doi:10.1016/0167-8655(96)00028-1
[22] Krishnapuram, R; Joshi, A; Nasraoui, O; Yi, L, Low-complexity fuzzy relational clustering algorithms for web mining, IEEE Transactions on Fuzzy Systems, 9, 595-607, (2001) · doi:10.1109/91.940971
[23] Leite, D; Ballini, R; Costa, P; Gomide, F, Evolving fuzzy granular modeling from nonstationary fuzzy data streams, Evolving Systems, 3, 65-79, (2012) · doi:10.1007/s12530-012-9050-9
[24] Wu, KL; Yang, MS, Alternative c-means clustering algorithms, Pattern Recognition, 35, 2267-2278, (2002) · Zbl 1006.68876 · doi:10.1016/S0031-3203(01)00197-2
[25] Xu, Z, Fuzzy ordered distance measures, Fuzzy Optimization and Decision Making, 11, 73-97, (2012) · Zbl 1254.91119 · doi:10.1007/s10700-011-9113-6