Maximum likelihood estimation of heterogeneous mixtures of Gaussian and uniform distributions. (English) Zbl 1203.62017

Summary: Existence and consistency of maximum likelihood estimators are shown for the parameters of heterogeneous mixtures of Gaussian and uniform distributions with a known number of components, under constraints that prevent the likelihood from degenerating and ensure identifiability. The EM algorithm is discussed, and for the special case of a single uniform component a practical scheme for finding a good local optimum is proposed. The method is compared theoretically and empirically with the estimation of a Gaussian mixture with a “noise component”, as introduced by J. D. Banfield and A. E. Raftery [Biometrics 49, No. 3, 803–821 (1993; Zbl 0794.62034)], to find out whether it is a worthwhile alternative, particularly in situations with outliers and points that do not belong to the Gaussian components.
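As a rough illustration of the kind of model the summary describes, the following sketch runs EM for a univariate mixture of one Gaussian and one uniform component. It is not the authors' scheme: the uniform support is simply fixed to the data range (closer in spirit to a noise component than to the paper's estimated uniform parameters), the lower bound `sigma_min` is an assumed stand-in for the constraints that prevent degeneration, and the function names, starting values and simulated data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_gauss_plus_uniform(xs, n_iter=200, sigma_min=1e-3):
    """EM for pi0 * U[a, b] + (1 - pi0) * N(mu, sigma^2).

    Assumption: [a, b] is fixed to the data range and is not estimated.
    sigma is bounded below by sigma_min so the likelihood cannot degenerate.
    """
    a, b = min(xs), max(xs)
    if b <= a:
        raise ValueError("data must contain at least two distinct values")
    u = 1.0 / (b - a)  # constant uniform density on the data range
    n = len(xs)

    # Crude starting values (an assumption, not a recommended initialisation).
    mu = sorted(xs)[n // 2]
    sigma = max((b - a) / 4.0, sigma_min)
    pi0 = 0.5

    for _ in range(n_iter):
        # E-step: posterior probability that each point came from the uniform.
        tau = []
        for x in xs:
            noise = pi0 * u
            gauss = (1.0 - pi0) * normal_pdf(x, mu, sigma)
            tau.append(noise / (noise + gauss))

        # M-step: weighted updates for the proportion and Gaussian parameters.
        w = [1.0 - t for t in tau]
        w_sum = sum(w)
        if w_sum <= 0.0:
            break  # Gaussian component has vanished; stop rather than divide by zero
        pi0 = sum(tau) / n
        mu = sum(wi * x for wi, x in zip(w, xs)) / w_sum
        var = sum(wi * (x - mu) ** 2 for wi, x in zip(w, xs)) / w_sum
        sigma = max(math.sqrt(var), sigma_min)

    return pi0, mu, sigma


# Simulated example: 300 Gaussian points plus 60 uniformly scattered outliers.
random.seed(0)
data = [random.gauss(5.0, 1.0) for _ in range(300)]
data += [random.uniform(-10.0, 20.0) for _ in range(60)]
pi0_hat, mu_hat, sigma_hat = em_gauss_plus_uniform(data)
print(pi0_hat, mu_hat, sigma_hat)
```

Like any EM run, this reaches only a local optimum that depends on the starting values, which is why the summary stresses a practical scheme for finding a good one.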

MSC:

62F10 Point estimation
65C60 Computational problems in statistics (MSC2010)
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62F12 Asymptotic properties of parametric estimators

Citations:

Zbl 0794.62034

Software:

mclust
References:

[1] Banfield, J.; Raftery, A.E., Model-based Gaussian and non-Gaussian clustering, Biometrics, 49, 803-821, (1993) · Zbl 0794.62034
[2] Coretto, P., 2008. The noise component in model-based clustering. Ph.D. Thesis, Department of Statistical Science, University College London, http://www.ontherubicon.com/pietro/docs/phdthesis.pdf
[3] Coretto, P., Hennig, C., in press. A simulation study to compare robust clustering methods based on mixtures. Advances in Data Analysis and Classification, doi:10.1007/s11634-010-0065-4 · Zbl 1284.62366
[4] Cuesta-Albertos, J.A.; Gordaliza, A.; Matrán, C., Trimmed k-means: an attempt to robustify quantizers, Annals of statistics, 25, 553-576, (1997) · Zbl 0878.62045
[5] Day, N.E., Estimating the components of a mixture of normal distributions, Biometrika, 56, 463-474, (1969) · Zbl 0183.48106
[6] Dennis, J.E.J. (Ed.), 1981. Algorithms for nonlinear fitting. In: NATO Advanced Research Symposium. Cambridge University Press, Cambridge, England.
[7] DeSarbo, W.S.; Cron, W.L., A maximum likelihood methodology for clusterwise linear regression, Journal of classification, 5, 249-282, (1988) · Zbl 0692.62052
[8] Fraley, C.; Raftery, A.E., How many clusters? which clustering method? answers via model-based cluster analysis, The computer journal, 41, 578-588, (1998), doi:10.1093/comjnl/41.8.578 · Zbl 0920.68038
[9] Fraley, C.; Raftery, A.E., Model-based clustering, discriminant analysis, and density estimation, Journal of the American statistical association, 97, 611-631, (2002) · Zbl 1073.62545
[10] Fraley, C., Raftery, A.E., September 2006. Mclust version 3 for R: normal mixture modeling and model-based clustering. Technical Report 504, University of Washington, Department of Statistics.
[11] García-Escudero, L.A.; Gordaliza, A.; Matrán, C.; Mayo-Iscar, A., A general trimming approach to robust cluster analysis, Annals of statistics, 38, 3, 1324-1345, (2008) · Zbl 1360.62328
[12] Hathaway, R.J., A constrained formulation of maximum-likelihood estimation for normal mixture distributions, The annals of statistics, 13, 795-800, (1985) · Zbl 0576.62039
[13] Hathaway, R.J., A constrained EM algorithm for univariate normal mixtures, Journal of statistical computation and simulation, 23, 211-230, (1986)
[14] Hennig, C., Breakdown points for maximum likelihood estimators of location-scale mixtures, The annals of statistics, 32, 4, 1313-1340, (2004) · Zbl 1047.62063
[15] Ingrassia, S., A likelihood-based constrained algorithm for multivariate normal mixture models, Statistical methods and applications, 13, 2, 151-166, (2004) · Zbl 1205.62066
[16] Karlis, D.; Xekalaki, E., Choosing initial values for the EM algorithm for finite mixtures, Computational statistics and data analysis, 41, 3-4, 577-590, (2003) · Zbl 1429.62082
[17] Kiefer, N.M.; Wolfowitz, J., Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters, Annals of mathematical statistics, 27, 364, 887-906, (1956) · Zbl 0073.14701
[18] Perlman, M.D., On the strong consistency of approximate maximum likelihood estimator, (), 263-282
[19] Redner, R., Note on the consistency of the maximum likelihood estimate for nonidentifiable distributions, The annals of statistics, 9, 225-228, (1981) · Zbl 0453.62021
[20] Redner, R.; Walker, H.F., Mixture densities, maximum likelihood and the EM algorithm, SIAM review, 26, 195-239, (1984) · Zbl 0536.62021
[21] Tanaka, K.; Takemura, A., Strong consistency of the MLE for finite location-scale mixtures when the scale parameters are exponentially small, Bernoulli, 12, 1003-1017, (2006) · Zbl 1117.62025
[22] Teicher, H., Identifiability of mixtures, The annals of mathematical statistics, 32, 244-248, (1961) · Zbl 0146.39302
[23] Wald, A., Note on the consistency of the maximum likelihood estimate, The annals of mathematical statistics, 20, 595-601, (1949) · Zbl 0034.22902
[24] Wu, C.F.J., On the convergence properties of the EM algorithm, Annals of statistics, 11, 1, 95-103, (1983) · Zbl 0517.62035
[25] Yakowitz, S.J.; Spragins, J., On the identifiability of finite mixtures, The annals of mathematical statistics, 39, 209-214, (1968) · Zbl 0155.25703
[26] Yao, W., A profile likelihood method for normal mixture with unequal variance, Journal of statistical planning and inference, 140, 7, 2089-2098, (2010) · Zbl 1184.62029
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.