Variational Bayes approximations for clustering via mixtures of normal inverse Gaussian distributions. (English) Zbl 1459.62122

Summary: Parameter estimation for model-based clustering using a finite mixture of normal inverse Gaussian (NIG) distributions is achieved through variational Bayes approximations. Univariate NIG mixtures and multivariate NIG mixtures are considered. The use of variational Bayes approximations here is a substantial departure from the traditional EM approach and alleviates some of the associated computational complexities and uncertainties. Our variational algorithm is applied to simulated and real data. The paper concludes with discussion and suggestions for future work.


MSC: 62H30 Classification and discrimination; cluster analysis (statistical aspects)


Software: R; robustbase; rrcov; MASS (R)

