Bayesian shrinkage in mixture-of-experts models: identifying robust determinants of class membership. (English) Zbl 1474.62088

Summary: A method for implicit variable selection in mixture-of-experts frameworks is proposed. We introduce a prior structure where information is taken from a set of independent covariates. Robust class membership predictors are identified using a normal-gamma prior. The resulting model setup is used in a finite mixture of Bernoulli distributions to find homogeneous clusters of women in Mozambique based on their information sources on HIV. Fully Bayesian inference is carried out via the implementation of a Gibbs sampler.
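
The model structure named in the summary can be illustrated with a minimal simulation sketch. It covers only the generative side (shrinkage prior, class membership weights, Bernoulli outcomes) and is not the paper's Gibbs sampler. The multinomial-logit form of the weights, the shape/rate parametrization of the normal-gamma prior, and all names and numeric values (`draw_normal_gamma`, `class_probabilities`, `shape`, `rate`, `n`, `p`, `K`, `J`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_normal_gamma(size, shape=0.1, rate=0.05):
    """Draw coefficients from a normal-gamma scale mixture (assumed parametrization).

    Local variances: tau2 ~ Gamma(shape, rate); coefficients: beta | tau2 ~ N(0, tau2).
    A small shape places much prior mass near zero while keeping heavy tails.
    """
    tau2 = rng.gamma(shape, 1.0 / rate, size=size)  # NumPy takes scale = 1 / rate
    return rng.normal(0.0, np.sqrt(tau2))

def class_probabilities(X, beta):
    """Multinomial-logit weights: one row of class membership probabilities per observation."""
    eta = X @ beta
    eta = eta - eta.max(axis=1, keepdims=True)  # stabilise the softmax numerically
    w = np.exp(eta)
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical sizes: observations, covariates, classes, binary items.
n, p, K, J = 500, 10, 3, 6

X = rng.normal(size=(n, p))                    # covariates driving class membership
beta = np.zeros((p, K))                        # first class is the baseline (all zeros)
beta[:, 1:] = draw_normal_gamma((p, K - 1))    # shrinkage prior on the remaining classes

probs = class_probabilities(X, beta)
z = np.array([rng.choice(K, p=pr) for pr in probs])  # latent class labels
theta = rng.uniform(0.1, 0.9, size=(K, J))           # class-specific success probabilities
y = rng.binomial(1, theta[z])                        # n x J matrix of Bernoulli outcomes
```

Most entries of `beta` drawn this way are close to zero, with a few large ones, which is the behaviour a shrinkage prior relies on to separate robust class membership predictors from irrelevant covariates.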


62F15 Bayesian inference
62J07 Ridge regression; shrinkage estimators (Lasso)
62H30 Classification and discrimination; cluster analysis (statistical aspects)
90-08 Computational methods for problems pertaining to operations research and mathematical programming
62P10 Applications of statistics to biology and medical sciences; meta analysis

