A mixture latent variable model for modeling mixed data in heterogeneous populations and its applications. (English) Zbl 1421.62074

Summary: Latent variable models are widely used for the joint modeling of mixed data, including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent, and that the count and continuous variables follow Poisson and normal distributions, respectively. Since such data may be drawn from different subpopulations, unobserved heterogeneity has to be taken into account. A mixture distribution is therefore assumed for the latent variable, which accounts for this heterogeneity. The generalized EM algorithm, which uses the Newton-Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of the parameters. The standard errors of the maximum likelihood estimates are computed using the supplemented EM algorithm. An analysis of primary biliary cirrhosis data is presented as an application of the proposed model.
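The data-generating structure described in the summary can be illustrated with a small simulation. The following is only a minimal sketch, not the authors' implementation: the logit link for the binary outcome, the log link for the count outcome, the single covariate `x`, the two-component normal mixture for the latent variable, and every numeric parameter value are assumptions chosen for illustration. The function names are hypothetical.

```python
import math
import random


def sample_latent(rng, weights, means, sds):
    """Draw one latent value from a finite normal mixture.

    Returns the (unobserved) component label and the latent value.
    """
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return k, rng.gauss(means[k], sds[k])


def sample_poisson(rng, rate):
    """Draw a Poisson variate with Knuth's multiplication method.

    Adequate for the small rates used in this sketch.
    """
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate(n, seed=0):
    """Simulate n subjects with one binary, one count and one continuous outcome.

    Given the latent variable z (and covariate x), the three outcomes are
    drawn independently, mirroring the conditional independence assumption.
    All coefficients below are illustrative assumptions.
    """
    rng = random.Random(seed)
    weights, means, sds = [0.6, 0.4], [-1.0, 1.5], [0.5, 0.5]
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # one observed covariate (assumed)
        k, z = sample_latent(rng, weights, means, sds)
        # Binary outcome: Bernoulli with an assumed logit link.
        p = 1.0 / (1.0 + math.exp(-(-0.2 + 0.5 * x + 1.0 * z)))
        binary = 1 if rng.random() < p else 0
        # Count outcome: Poisson with an assumed log link.
        count = sample_poisson(rng, math.exp(0.3 + 0.2 * x + 0.4 * z))
        # Continuous outcome: normal with mean linear in x and z.
        continuous = rng.gauss(1.0 + 0.7 * x + 0.8 * z, 1.0)
        rows.append({"component": k, "x": x, "binary": binary,
                     "count": count, "continuous": continuous})
    return rows


data = simulate(500)
```

In a real analysis the component label and the latent value are unobserved. They are kept in the simulated rows only so that the induced heterogeneity can be inspected.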

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62J05 Linear regression; mixed models
62J12 Generalized linear models (logistic models)
62P10 Applications of statistics to biology and medical sciences; meta analysis
