Subspace clustering for the finite mixture of generalized hyperbolic distributions. (English) Zbl 1474.62187

Summary: The finite mixture of generalized hyperbolic distributions is a flexible model for clustering, but the large number of parameters to be estimated, especially in high dimensions, can make it computationally expensive to fit. To address this issue, we extend the subspace clustering technique developed for finite Gaussian mixtures to finite mixtures of generalized hyperbolic distributions. The methodology is demonstrated with numerical experiments.

MSC:

62H12 Estimation in multivariate analysis
62H25 Factor analysis and principal components; correspondence analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

QRM; mixsmsn; UCI-ml; PGMM

References:

[1] Abramowitz M, Stegun IA (1964) Handbook of mathematical functions with formulas, graphs, and mathematical tables, ninth Dover printing, tenth GPO printing edition. Dover, New York
[2] Aitken, AC, On Bernoulli’s numerical solution of algebraic equations, Proc R Soc Edinb, 46, 289-305, (1926) · JFM 52.0098.05
[3] Baricz, A., Turán type inequalities for some probability density functions, Studia Scientiarum Mathematicarum Hungarica, 47, 175-189, (2010) · Zbl 1234.62010
[4] Barndorff-Nielsen, O., Hyperbolic distributions and distributions on hyperbolae, Scand J Stat, 5, 151-157, (1978) · Zbl 0386.60018
[5] Bellman RE (2003) Dynamic programming. Courier Corporation
[6] Böhning, D.; Dietz, E.; Schaub, R.; Schlattmann, P.; Lindsay, B., The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family, Ann Inst Stat Math, 46, 373-388, (1994) · Zbl 0802.62017
[7] Bouveyron, C.; Brunet-Saumard, C., Model-based clustering of high-dimensional data: a review, Comput Stat Data Anal, 71, 52-78, (2013) · Zbl 1471.62032
[8] Bouveyron, C.; Girard, S.; Schmid, C., High-dimensional data clustering, Comput Stat Data Anal, 52, 502-519, (2007) · Zbl 1452.62433
[9] Browne, RP; McNicholas, PD, A mixture of generalized hyperbolic distributions, Can J Stat, 43, 176-198, (2015) · Zbl 1320.62144
[10] Campbell, NA; Mahon, RJ, A multivariate study of variation in two species of rock crab of the genus Leptograpsus, Aust J Zool, 22, 417-425, (1974)
[11] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, J R Stat Soc Ser B, 39, 1-38, (1977) · Zbl 0364.62022
[12] Dias DB, Madeo RCB, Rocha T, Biscaro HH, Peres SM (2009) Hand movement recognition for Brazilian sign language: a study using distance-based neural networks. In: 2009 international joint conference on neural networks, pp 697-704
[13] Dua D, Karra Taniskidou E (2017) UCI machine learning repository. School of Information and Computer Science, University of California, Irvine, CA. http://archive.ics.uci.edu/ml
[14] Flury, BN; Gautschi, W., An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form, SIAM J Sci Stat Comput, 7, 169-184, (1986) · Zbl 0614.65043
[15] Forina, M.; Armanino, C.; Castino, M.; Ubigli, M., Multivariate data analysis as a discriminating method of the origin of wines, Vitis, 25, 189-201, (1986)
[16] Ghahramani Z, Hinton G (1997) The EM algorithm for factor analyzers. Technical Report CRG-TR-96-1, University of Toronto, Toronto
[17] Hubert, L.; Arabie, P., Comparing partitions, J Classif, 2, 193-218, (1985) · Zbl 0587.62128
[18] Kailing K, Kriegel H-P, Kröger P (2004) Density-connected subspace clustering for high-dimensional data. In: Proceedings of the 2004 SIAM international conference on data mining, pp 246-256
[19] Kozubowski, T.; Podgórski, K.; Rychlik, I., Multivariate generalized Laplace distribution and related random fields, J Multivar Anal, 113, 59-72, (2013) · Zbl 1260.60100
[20] McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York · Zbl 0963.62061
[21] McNeil AJ, Frey R, Embrechts P (2005) Quantitative risk management: concepts, techniques and tools. Princeton University Press, Princeton · Zbl 1089.91037
[22] McNicholas, PD; Murphy, TB, Parsimonious Gaussian mixture models, Stat Comput, 18, 285-296, (2008)
[23] McNicholas PD, ElSherbiny A, McDaid AF, Murphy TB (2015) PGMM: parsimonious Gaussian mixture models. R package version 1.2. https://CRAN.R-project.org/package=pgmm. Accessed 1 June 2017
[24] McNicholas S, McNicholas P, Browne R (2017) A mixture of variance-gamma factor analyzers. In: Ahmed S (ed) Big and complex data analysis. Springer, Cham, pp 369-385 · Zbl 1381.62187
[25] Prates MO, Cabral CRB, Lachos VH (2013) mixsmsn: fitting finite mixture of scale mixture of skew-normal distributions. J Stat Softw 54(12):1-20. http://www.jstatsoft.org/v54/i12/
[26] Schwarz, G., Estimating the dimension of a model, Ann Stat, 6, 461-464, (1978) · Zbl 0379.62005
[27] Tortora, C.; McNicholas, PD; Browne, RP, A mixture of generalized hyperbolic factor analyzers, Adv Data Anal Classif, 10, 423-440, (2016) · Zbl 1414.62278
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.