Mixtures of skewed matrix variate bilinear factor analyzers. (English) Zbl 1474.62227

Summary: In recent years, data have become increasingly high-dimensional, creating a growing need for dimension reduction techniques for clustering. Although such techniques are firmly established in the literature for multivariate data, comparatively few exist for matrix variate, or three-way, data. Furthermore, the few methods that are available all assume matrix variate normality, which is not always sensible when clusters exhibit skewness or excess kurtosis. To address this, mixtures of bilinear factor analyzers using skewed matrix variate distributions are proposed. In all, four such mixture models are presented, based on the matrix variate skew-\(t\), generalized hyperbolic, variance-gamma, and normal inverse Gaussian distributions, respectively.
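As a rough illustration of what a skewed matrix variate distribution looks like in practice, the sketch below draws one \(n \times p\) observation using a normal variance-mean mixture, \(X = M + WA + \sqrt{W}\,V\), with \(V\) matrix variate normal and \(W\) a positive scalar weight. The summary does not state this representation; it is assumed here as the usual construction for such distributions, with an inverse-gamma weight giving the skew-\(t\) case. The function names (`sample_matrix_normal`, `sample_skewed_matrix`, `sample_skew_t`) and all parameter values are hypothetical and are not taken from the paper or from any package.

```python
import numpy as np


def sample_matrix_normal(rng, M, Sigma, Psi):
    """Draw one n x p matrix with mean M, row covariance Sigma (n x n)
    and column covariance Psi (p x p)."""
    n, p = M.shape
    L_row = np.linalg.cholesky(Sigma)  # Sigma = L_row @ L_row.T
    L_col = np.linalg.cholesky(Psi)    # Psi = L_col @ L_col.T
    Z = rng.standard_normal((n, p))    # i.i.d. standard normal entries
    return M + L_row @ Z @ L_col.T


def sample_skewed_matrix(rng, M, A, Sigma, Psi, w):
    """Variance-mean mixture for a given weight w > 0 (assumed form):
    X = M + w * A + sqrt(w) * V, with V a zero-mean matrix normal."""
    V = sample_matrix_normal(rng, np.zeros(M.shape), Sigma, Psi)
    return M + w * A + np.sqrt(w) * V


def sample_skew_t(rng, M, A, Sigma, Psi, nu):
    """Skew-t case (assumption): W ~ inverse-gamma(nu/2, nu/2), drawn as the
    reciprocal of a gamma variate with shape nu/2 and rate nu/2."""
    w = 1.0 / rng.gamma(shape=nu / 2.0, scale=2.0 / nu)
    return sample_skewed_matrix(rng, M, A, Sigma, Psi, w)


# Illustrative parameter values only.
M = np.zeros((3, 2))                   # location matrix
A = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])             # skewness matrix
Sigma = np.eye(3)                      # row covariance
Psi = np.array([[1.0, 0.3],
                [0.3, 1.0]])           # column covariance

X = sample_skew_t(np.random.default_rng(0), M, A, Sigma, Psi, nu=5.0)
```

Setting `A` to zero removes the skewness term and leaves a symmetric scale mixture. Under the same assumption, the other three distributions named in the summary would differ only in the distribution chosen for the weight `w`.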


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H25 Factor analysis and principal components; correspondence analysis

