Eigenvalues and constraints in mixture modeling: geometric and computational issues. (English) Zbl 1414.62071

Summary: This paper reviews the use of eigenvalue restrictions for constrained parameter estimation in mixtures of elliptical distributions under the likelihood approach. The restrictions serve a twofold purpose: to avoid convergence to degenerate solutions and to reduce the onset of uninteresting (spurious) local maximizers arising from complex likelihood surfaces. The paper shows how the constraints may play a key role in the theory of Euclidean data clustering. The aim here is to provide a reasoned survey of the constraints and their applications, considering the contributions of many authors and spanning the literature of the last 30 years.
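As a concrete illustration (not part of the record itself), the eigenvalue-ratio constraint discussed in the surveyed literature bounds the ratio between the largest and smallest eigenvalues of the component scatter matrices by a constant c >= 1, which rules out degenerate (near-singular) solutions. The sketch below is a simplified, unweighted version of the truncation step in the fast algorithm of Fritz, García-Escudero and Mayo-Iscar (2013), reference [23]: the threshold m is searched among the candidate values e_j and e_j/c, and eigenvalues are clipped to [m, c*m] so that the constrained set minimizes a likelihood-based cost. The function name and the omission of component weights are simplifications for illustration.

```python
import numpy as np

def truncate_eigenvalues(eigvals, c):
    """Enforce max(e)/min(e) <= c by clipping eigenvalues to [m, c*m].

    Simplified, unweighted sketch of the constrained-estimation step of
    Fritz, Garcia-Escudero and Mayo-Iscar (2013): the optimal threshold m
    lies among the candidates e_j and e_j/c, and is chosen to minimize
    the likelihood-based cost sum_j (log t_j + e_j / t_j), where t_j is
    the truncated eigenvalue.
    """
    e = np.sort(np.asarray(eigvals, dtype=float))
    if e[-1] <= c * e[0]:
        return e  # constraint already satisfied, nothing to truncate
    best_m, best_cost = None, np.inf
    for m in np.concatenate([e, e / c]):  # candidate thresholds
        t = np.clip(e, m, c * m)
        cost = np.sum(np.log(t) + e / t)
        if cost < best_cost:
            best_m, best_cost = m, cost
    return np.clip(e, best_m, c * best_m)
```

For instance, `truncate_eigenvalues([1e-3, 1.0, 4.0], 10.0)` returns eigenvalues whose ratio is at most 10, while an input already satisfying the constraint is returned unchanged. In the full algorithm the same truncation is applied jointly to the eigenvalues of all component covariance matrices, weighted by the mixing proportions.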


62F10 Point estimation
62F12 Asymptotic properties of parametric estimators
62F30 Parametric inference under constraints
62F35 Robustness and adaptive procedures (parametric inference)


TCLUST; mclust
Full Text: DOI Link


[1] Andrews, JL; McNicholas, PD, Extending mixtures of multivariate \(t\)-factor analyzers, Stat Comput, 21, 361-373, (2011) · Zbl 1255.62171
[2] Andrews, JL; McNicholas, PD; Subedi, S., Model-based classification via mixtures of multivariate \(t\)-distributions, Comput Stat Data Anal, 55, 520-529, (2011) · Zbl 1247.62151
[3] Banfield, JD; Raftery, AE, Model-based Gaussian and non-Gaussian clustering, Biometrics, 49, 803-821, (1993) · Zbl 0794.62034
[4] Baudry, J-P; Celeux, G., EM for mixtures: initialization requires special care, Stat Comput, 22, 1021-1029, (2015) · Zbl 1331.62301
[5] Biernacki, C., Initializing EM using the properties of its trajectories in Gaussian mixtures, Stat Comput, 14, 267-279, (2004)
[6] Biernacki, C.; Chrétien, S., Degeneracy in the maximum likelihood estimation of univariate Gaussian mixtures with the EM, Stat Probab Lett, 61, 373-382, (2003) · Zbl 1038.62023
[7] Biernacki, C.; Celeux, G.; Govaert, G., Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models, Comput Stat Data Anal, 41, 561-575, (2003) · Zbl 1429.62235
[8] Boyles, RA, On the convergence of the EM algorithm, J R Stat Soc B, 45, 47-50, (1983) · Zbl 0508.62030
[9] Celeux, G.; Govaert, G., Gaussian parsimonious clustering models, Pattern Recognit, 28, 781-793, (1995)
[10] Cerioli A, García-Escudero LA, Mayo-Iscar A, Riani M (2016) Finding the number of groups in model-based clustering via constrained likelihoods. http://uvadoc.uva.es/handle/10324/18093
[11] Chanda, KC, A note on the consistency and maxima of the roots of likelihood equations, Biometrika, 41, 56-61, (1954) · Zbl 0055.12901
[12] Ciuperca, G.; Ridolfi, A.; Idier, J., Penalized maximum likelihood estimator for normal mixtures, Scand J Stat, 30, 45-59, (2003) · Zbl 1034.62018
[13] Cramér H (1946) Mathematical methods of statistics. Princeton University Press, Princeton
[14] Day, N., Estimating the components of a mixture of normal distributions, Biometrika, 56, 463-474, (1969) · Zbl 0183.48106
[15] Dempster, A.; Laird, N.; Rubin, D., Maximum likelihood from incomplete data via the EM algorithm, J R Stat Soc Ser B (Methodol), 39, 1-38, (1977) · Zbl 0364.62022
[16] Dennis JE (1981) Algorithms for nonlinear fitting. In: Proceedings of the NATO advanced research symposium, Cambridge, England. Cambridge University
[17] Dykstra, RL, An algorithm for restricted least squares regression, J Am Stat Assoc, 78, 837-842, (1983) · Zbl 0535.62063
[18] Fang K, Anderson T (1990) Statistical inference in elliptically contoured and related distributions. Allerton Press, New York · Zbl 0747.00016
[19] Fraley C, Raftery AE (2006) Mclust version 3: an R package for normal mixture modeling and model-based clustering. Technical report, DTIC Document
[20] Fraley, C.; Raftery, A., Bayesian regularization for normal mixture estimation and model-based clustering, J Classif, 24, 155-181, (2007) · Zbl 1159.62302
[21] Fraley C, Raftery A, Murphy T, Scrucca L (2012) mclust version 4 for R: normal mixture modeling for model-based clustering, classification, and density estimation. University of Washington, Seattle
[22] Fritz, H.; García-Escudero, LA; Mayo-Iscar, A., tclust: an R package for a trimming approach to cluster analysis, J Stat Softw, 47, 1-26, (2012)
[23] Fritz, H.; García-Escudero, LA; Mayo-Iscar, A., A fast algorithm for robust constrained clustering, Comput Stat Data Anal, 61, 124-136, (2013) · Zbl 1349.62264
[24] Gallegos MT (2002) Maximum likelihood clustering with outliers. In: Jajuga K, Sokolowski A, Bock H-H (eds) Classification, clustering, and data analysis: recent advances and applications. Springer, Berlin, pp 247-255
[25] Gallegos, M.; Ritter, G., Trimmed ML estimation of contaminated mixtures, Sankhya (Ser A), 71, 164-220, (2009) · Zbl 1193.62021
[26] García-Escudero, LA; Gordaliza, A.; Matrán, C.; Mayo-Iscar, A., A general trimming approach to robust cluster analysis, Ann Stat, 36, 1324-1345, (2008) · Zbl 1360.62328
[27] García-Escudero, LA; Gordaliza, A.; Mayo-Iscar, A., A constrained robust proposal for mixture modeling avoiding spurious solutions, Adv Data Anal Classif, 8, 27-43, (2014)
[28] García-Escudero, LA; Gordaliza, A.; Matrán, C.; Mayo-Iscar, A., Avoiding spurious local maximizers in mixture modelling, Stat Comput, 25, 619-633, (2015) · Zbl 1331.62100
[29] García-Escudero, LA; Gordaliza, A.; Greselin, F.; Ingrassia, S.; Mayo-Iscar, A., The joint role of trimming and constraints in robust estimation for mixtures of Gaussian factor analyzers, Comput Stat Data Anal, 99, 131-147, (2016) · Zbl 06697663
[30] García-Escudero LA, Gordaliza A, Greselin F, Ingrassia S, Mayo-Iscar A (2017) Eigenvalues in robust approaches to mixture modeling: a review. Technical report, in preparation · Zbl 06697663
[31] Ghahramani Z, Hinton G (1997) The EM algorithm for factor analyzers. Technical report CRG-TR-96-1, University of Toronto
[32] Greselin, F.; Ingrassia, S., Constrained monotone EM algorithms for mixtures of multivariate \(t\)-distributions, Stat Comput, 20, 9-22, (2010)
[33] Greselin, F.; Ingrassia, S., Maximum likelihood estimation in constrained parameter spaces for mixtures of factor analyzers, Stat Comput, 25, 215-226, (2015) · Zbl 1331.62307
[34] Greselin, F.; Ingrassia, S.; Punzo, A., Assessing the pattern of covariance matrices via an augmentation multiple testing procedure, Stat Methods Appl, 20, 141-170, (2011) · Zbl 1232.62090
[35] Hathaway, RJ, A constrained formulation of maximum-likelihood estimation for normal mixture distributions, Ann Stat, 13, 795-800, (1985) · Zbl 0576.62039
[36] Hathaway, RJ, A constrained EM algorithm for univariate normal mixtures, J Stat Comput Simul, 23, 211-230, (1986)
[37] Hennig, C., Breakdown points for maximum likelihood estimators of location-scale mixtures, Ann Stat, 32, 1313-1340, (2004) · Zbl 1047.62063
[38] Ingrassia, S., A comparison between the simulated annealing and the EM algorithms in normal mixture decompositions, Stat Comput, 2, 203-211, (1992)
[39] Ingrassia, S., A likelihood-based constrained algorithm for multivariate normal mixture models, Stat Methods Appl, 13, 151-166, (2004) · Zbl 1205.62066
[40] Ingrassia, S.; Rocci, R., Constrained monotone EM algorithms for finite mixture of multivariate Gaussians, Comput Stat Data Anal, 51, 5339-5351, (2007) · Zbl 1445.62116
[41] Ingrassia, S.; Rocci, R., Degeneracy of the EM algorithm for the MLE of multivariate Gaussian mixtures and dynamic constraints, Comput Stat Data Anal, 55, 1715-1725, (2011) · Zbl 1328.65030
[42] Kiefer, J.; Wolfowitz, J., Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters, Ann Math Stat, 27, 887-906, (1956) · Zbl 0073.14701
[43] Kiefer, NM, Discrete parameter variation: efficient estimation of a switching regression model, Econometrica, 46, 427-434, (1978) · Zbl 0408.62058
[44] Lindsay BG (1995) Mixture models: theory, geometry and applications. NSF-CBMS regional conference series in probability and statistics, vol 5. Institute of Mathematical Statistics, Hayward, CA
[45] McLachlan G, Krishnan T (2008a) The EM algorithm and extensions, 2nd edn, vol 589. Wiley, New York · Zbl 1165.62019
[46] McLachlan GJ, Krishnan T (2008b) The EM algorithm and its extensions, 2nd edn. Wiley, New York · Zbl 1165.62019
[47] McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York · Zbl 0963.62061
[48] McNicholas, PD; Murphy, TB, Parsimonious Gaussian mixture models, Stat Comput, 18, 285-296, (2008)
[49] Meng, X-L, On the rate of convergence of the ECM algorithm, Ann Stat, 22, 326-339, (1994) · Zbl 0803.65146
[50] Meng, X-L; van Dyk, D., The EM algorithm: an old folk-song sung to a fast new tune, J R Stat Soc B, 59, 511-567, (1997) · Zbl 1090.62518
[51] Nettleton, D., Convergence properties of the EM algorithm in constrained spaces, Canad J Stat, 27, 639-644, (1999) · Zbl 0942.62033
[52] O’Hagan A, White A (2016) Improved model-based clustering performance using Bayes initialization averaging. Technical report, arXiv:1504.06870v4
[53] O’Hagan, A.; Murphy, TB; Gormley, C., Computational aspects of fitting mixture models via the expectation-maximisation algorithm, Comput Stat Data Anal, 56, 3843-3864, (2013) · Zbl 1255.62180
[54] Puntanen S, Styan GP, Isotalo J (2011) Matrix tricks for linear statistical models. Springer, Berlin · Zbl 1291.62014
[55] Redner, RA; Walker, HF, Mixture densities, maximum likelihood and the EM algorithm, SIAM Rev, 26, 195-239, (1984) · Zbl 0536.62021
[56] Ritter G (2014) Cluster analysis and variable selection. CRC Press, Boca Raton · Zbl 1341.62037
[57] Rocci R, Gattone SA, Di Mari R (2016) A data driven equivariant approach to constrained Gaussian mixture modeling. Adv Data Anal Classif. doi:10.1007/s11634-016-0279-1
[58] Rousseeuw PJ, Leroy AM (2005) Robust regression and outlier detection. Wiley, New York · Zbl 0711.62030
[59] Rudin W (1976) Principles of mathematical analysis, 3rd edn. McGraw-Hill, Tokyo · Zbl 0346.26002
[60] Scrucca, L.; Fop, M.; Murphy, TB; Raftery, A., mclust 5: clustering, classification and density estimation using Gaussian finite mixture models, R J, 8, 289-317, (2016)
[61] Seo, B.; Kim, D., Root selection in normal mixture models, Comput Stat Data Anal, 56, 2454-2470, (2012) · Zbl 1252.62013
[62] Subedi, S.; Punzo, A.; Ingrassia, S.; McNicholas, PD, Clustering and classification via cluster-weighted factor analyzers, Adv Data Anal Classif, 7, 5-40, (2013) · Zbl 1271.62137
[63] Subedi S, Punzo A, Ingrassia S, McNicholas PD (2015) Cluster-weighted \(t\)-factor analyzers for robust model-based clustering and dimension reduction. Stat Methods Appl 24(4):623-649 · Zbl 1416.62362
[64] Tanaka, K., Strong consistency of the maximum likelihood estimator for finite mixtures of location-scale distributions when penalty is imposed on the ratios of the scale parameters, Scand J Stat, 36, 171-184, (2009) · Zbl 1190.62031
[65] Tanaka, K.; Takemura, A., Strong consistency of the maximum likelihood estimator for finite mixtures of location-scale distributions when the scale parameters are exponentially small, Bernoulli, 12, 1003-1017, (2006) · Zbl 1117.62025
[66] Tarone, RD; Gruenhage, G., A note on the uniqueness of the roots of the likelihood equations for vector-valued parameters, J Am Stat Assoc, 70, 903-904, (1975) · Zbl 0328.62018
[67] Tarone, RD; Gruenhage, G., Corrigenda: a note on the uniqueness of the roots of the likelihood equations for vector-valued parameters, J Am Stat Assoc, 74, 744, (1979)
[68] Theobald, C., An inequality with application to multivariate analysis, Biometrika, 62, 461-466, (1975) · Zbl 0316.62020
[69] Theobald, C., Corrections and amendments: an inequality with application to multivariate analysis, Biometrika, 63, 685, (1976)
[70] Tipping, M.; Bishop, CM, Mixtures of probabilistic principal component analysers, Neural Comput, 11, 443-482, (1999) · Zbl 0924.62068
[71] Titterington DM, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distributions. Wiley, New York · Zbl 0646.62013
[72] van Laarhoven PJM, Aarts EHL (1988) Simulated annealing: theory and practice. D. Reidel, Dordrecht
[73] Wu, CFJ, On convergence properties of the EM algorithm, Ann Stat, 11, 95-103, (1983) · Zbl 0517.62035
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.