Improved initialisation of model-based clustering using Gaussian hierarchical partitions. (English) Zbl 1414.62272

Summary: Initialisation of the EM algorithm in model-based clustering is often crucial. Different starting points in the parameter space often lead to different local maxima of the likelihood function and, hence, to different clustering partitions. Among the several approaches available in the literature, model-based agglomerative hierarchical clustering is used to provide initial partitions in the popular mclust R package. This choice is computationally convenient and often yields good clustering partitions. However, in certain circumstances, poor initial partitions may cause the EM algorithm to converge to a suboptimal local maximum of the likelihood function. We propose several simple and fast refinements based on data transformations and illustrate them through data examples.


MSC: 62H30 Classification and discrimination; cluster analysis (statistical aspects)


Software: R; clusfind; mclust; PGMM; Flury; Mixmod

