Computationally efficient learning of multivariate \(t\) mixture models with missing information. (English) Zbl 1189.62095

Maximum likelihood estimation is considered for a finite mixture of multivariate \(t\) distributions when the observations have values missing at random. A parameter-expanded expectation-maximization (PX-EM) algorithm is developed for the approximate computation of the estimates. Its performance is compared with that of the usual EM algorithm via simulations. Applications to wine recognition data and to morphological measurements of blue crabs are presented.

MSC:

62H12 Estimation in multivariate analysis
62-08 Computational methods for problems pertaining to statistics
62F10 Point estimation
