An evolutionary algorithm with crossover and mutation for model-based clustering. (English) Zbl 07413948

Summary: An evolutionary algorithm (EA) is developed as an alternative to the EM algorithm for parameter estimation in model-based clustering. This EA facilitates a different search of the fitness landscape, i.e., the likelihood surface, utilizing both crossover and mutation. Furthermore, this EA represents an efficient approach to “hard” model-based clustering and so it can be viewed as a sort of generalization of the \(k\)-means algorithm, which is itself equivalent to a restricted Gaussian mixture model. The EA is illustrated on several datasets, and its performance is compared with that of other hard clustering approaches and model-based clustering via the EM algorithm.
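
As a rough illustration of the idea in the summary, the sketch below runs a toy evolutionary search over hard cluster labels in the \(k\)-means special case. It is not the authors' algorithm: the fitness (within-cluster sum of squares), the uniform crossover, the random-relabel mutation, the truncation selection, and all function and parameter names (`wcss`, `crossover`, `mutate`, `evolve`, `pop_size`, `rate`) are assumptions made for illustration. With equal, spherical Gaussian components and hard assignments, minimising the within-cluster sum of squares corresponds to maximising the classification likelihood, which is the sense in which \(k\)-means is a restricted Gaussian mixture model.

```python
import random


def wcss(points, labels, k):
    """Within-cluster sum of squares (lower is better).

    Assumed stand-in for the fitness: for equal spherical Gaussian
    components with hard labels, a smaller value means a larger
    classification likelihood.
    """
    total = 0.0
    for g in range(k):
        members = [p for p, z in zip(points, labels) if z == g]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centre = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centre[d]) ** 2 for p in members for d in range(dim))
    return total


def crossover(a, b, rng):
    """Uniform crossover: each label is copied from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(labels, k, rng, rate=0.05):
    """Reassign each label to a random cluster with probability `rate`."""
    return [rng.randrange(k) if rng.random() < rate else z for z in labels]


def evolve(points, k, pop_size=20, generations=200, seed=0):
    """Toy EA over label vectors with elitist truncation selection."""
    rng = random.Random(seed)
    n = len(points)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda z: wcss(points, z, k))
        parents = pop[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), k, rng))
        pop = parents + children
    return min(pop, key=lambda z: wcss(points, z, k))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
    best = evolve(pts, k=2)
    print(best, wcss(pts, best, 2))
```

Crossover on raw label vectors ignores label switching (two parents can encode the same partition under different labels), so this sketch should be read only as a picture of how crossover and mutation explore the space of hard partitions. A model-based version would replace `wcss` with the classification log-likelihood of a less restricted Gaussian mixture.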

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

R; mclust; Flury; mixture
