zbMATH — the first resource for mathematics

Describing disability through individual-level mixture models for multivariate binary data. (English) Zbl 1126.62101
Summary: Data on functional disability are of widespread policy interest in the United States, especially with respect to planning for Medicare and Social Security for a growing population of elderly adults. We consider an extract of functional disability data from the National Long Term Care Survey (NLTCS) and attempt to develop disability profiles using variations of the Grade of Membership (GoM) model. We first describe GoM as an individual-level mixture model that allows individuals to have partial membership in several mixture components simultaneously. We then prove the equivalence between individual-level and population-level mixture models, and use this property to develop a Markov Chain Monte Carlo algorithm for Bayesian estimation of the model. We use our approach to analyze functional disability data from the NLTCS.
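As a rough illustration of the individual-level mixture idea in the summary, the following minimal Python sketch simulates binary responses for one individual with partial membership in several profiles. It is not the authors' code. The Dirichlet distribution on the membership vector, the names `lam` and `alpha`, and all numeric values are assumptions made for illustration only.

```python
import random

def sample_membership(alpha, rng):
    """Draw a membership vector g on the simplex from Dirichlet(alpha) (assumed prior)."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def marginal_prob(g, lam, j):
    """Individual-level form: P(x_j = 1 | g) = sum_k g_k * lam[k][j]."""
    return sum(g_k * lam_k[j] for g_k, lam_k in zip(g, lam))

def sample_responses(g, lam, rng):
    """Augmented form: each item picks a profile k with probability g_k,
    then the response is drawn as Bernoulli(lam[k][j])."""
    n_items = len(lam[0])
    x = []
    for j in range(n_items):
        k = rng.choices(range(len(g)), weights=g)[0]
        x.append(1 if rng.random() < lam[k][j] else 0)
    return x

rng = random.Random(0)
# Hypothetical values: K = 2 extreme profiles, J = 3 binary disability items.
lam = [[0.05, 0.10, 0.02],   # profile with low probability of each limitation
       [0.90, 0.80, 0.70]]   # profile with high probability of each limitation
alpha = [0.5, 0.5]

g = sample_membership(alpha, rng)   # partial membership in both profiles
x = sample_responses(g, lam, rng)   # one simulated binary response vector
print(g, x)
```

Averaging the augmented draw over the item-level profile choice gives back `marginal_prob`, which is one way to picture how an individual-level mixture can be rewritten with discrete latent classes; the paper's formal equivalence result may be stated differently.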

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
65C40 Numerical analysis or methods applied to Markov chains
62P25 Applications of statistics to social sciences
62N02 Estimation in survival analysis and censored data
[1] Airoldi, E., Fienberg, S. E., Joutard, C. and Love, T. (2007). Discovering latent patterns with hierarchical Bayesian mixed-membership models. In Data Mining Patterns: New Methods and Applications (P. Poncelet, F. Masseglia and M. Teisseire, eds.) 240-275. Idea Group Inc., Hershey, PA.
[2] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (B. N. Petrov and F. Csaki, eds.) 267-281. Akadémiai Kiadó, Budapest. · Zbl 0283.62006
[3] Berkman, L., Singer, B. and Manton, K. G. (1989). Black/white differences in health status and mortality among the elderly. Demography 26 661-678.
[4] Best, N., Cowles, M. K. and Vines, K. (1996). CODA: Convergence diagnosis and output analysis software for Gibbs sampling output (version 0.30). Technical report, MRC Cambridge, UK.
[5] Bishop, Y. M. M., Fienberg, S. E. and Holland, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge, MA. · Zbl 0332.62039
[6] Blei, D. M., Jordan, M. I. and Ng, A. Y. (2003). Latent Dirichlet allocation. J. Machine Learning Research 3 993-1022. · Zbl 1112.68379 · doi:10.1162/jmlr.2003.3.4-5.993
[7] Blei, D. M., Ng, A. Y. and Jordan, M. I. (2001). Latent Dirichlet allocation. Adv. in Neural Information Processing Systems 14. · Zbl 1112.68379
[8] Blumen, I., Kogan, M. and McCarthy, P. J. (1955). The Industrial Mobility of Labor as a Probability Process. Cornell Univ. Press, Ithaca, New York.
[9] Celeux, G. and Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. J. Classification 13 195-212. · Zbl 0861.62051 · doi:10.1007/BF01246098
[10] Corder, L. S. and Manton, K. G. (1991). National surveys and the health and functioning of the elderly: The effects of design and content. J. Amer. Statist. Assoc. 86 513-525.
[11] Cutler, D. M. (2001). Commentary on “changes in the prevalence of chronic disability in the United States black and nonblack population above age 65 from 1982 to 1999” by K. G. Manton and X. Gu. Proc. Natl. Acad. Sci. 98 6546-6547.
[12] Davidson, J. R. T., Woodbury, M. A., Zisook, S. and Giller, E. L. (1989). Classification of depression by grade of membership: A confirmation study. Psychological Medicine 19 987-998.
[13] Erosheva, E. A. (2002). Grade of membership and latent structure models with application to disability survey data. Ph.D. thesis, Carnegie Mellon Univ.
[14] Erosheva, E. A. (2003). Bayesian estimation of the grade of membership model. In Bayesian Statistics 7 (J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.) 501-510. Oxford Univ. Press.
[15] Erosheva, E. A. (2005). Comparing latent structures of the grade of membership, Rasch and latent class models. Psychometrika 70 619-626. · Zbl 1306.62409 · doi:10.1007/s11336-001-0899-y
[16] Erosheva, E. A. (2006). Latent class representation of the grade of membership model. Technical Report 492, Dept. Statistics, Univ. Washington.
[17] Erosheva, E. A. and Fienberg, S. E. (2005). Bayesian mixed membership models for soft clustering and classification. In Classification–The Ubiquitous Challenge (C. Weihs and W. Gaul, eds.) 11-26. Springer, New York.
[18] Erosheva, E. A., Fienberg, S. E. and Lafferty, J. (2004). Mixed-membership models of scientific publications. Proc. Natl. Acad. Sci. 101 (Suppl. 1) 5220-5227.
[19] Erosheva, E. A. and White, T. (2006). Operational definition of chronic disability in the National Long Term Care Survey: Problems and suggestions. Technical report, Dept. Statistics, Univ. Washington.
[20] Foody, G. M., Campbell, N. A., Trodd, N. M. and Wood, T. F. (1992). Derivation and applications of probabilistic measures of class membership from the maximum-likelihood classification. Photogrammetric Engineering and Remote Sensing 58 1335-1341.
[21] Gill, T. M., Hardy, S. E. and Williams, C. S. (2002). Underestimation of disability in community-living older persons. J. American Geriatrics Society 50 1492-1497.
[22] Gill, T. M. and Kurland, B. (2003). The burden and patterns of disability in activities of daily living among community-living older persons. J. Gerontology 58 70-75.
[23] Goodman, L. A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61 215-231. · Zbl 0281.62057 · doi:10.1093/biomet/61.2.215
[24] Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82 711-732. · Zbl 0861.62023 · doi:10.1093/biomet/82.4.711
[25] Griffiths, T. L. and Steyvers, M. (2004). Finding scientific topics. Proc. Natl. Acad. Sci. 101 (Suppl. 1) 5228-5235.
[26] Haberman, S. J. (1995). Book review of “Statistical applications using fuzzy sets,” by K. G. Manton, M. A. Woodbury and H. D. Tolley. J. Amer. Statist. Assoc. 90 1131-1133.
[27] Hastie, T., Tibshirani, R. and Friedman, J. H. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York. · Zbl 0973.62007
[28] Holland, P. W. and Rosenbaum, P. R. (1986). Conditional association and unidimensionality in monotone latent variable models. Ann. Statist. 14 1523-1543. · Zbl 0625.62102 · doi:10.1214/aos/1176350174
[29] Jordan, M., Ghahramani, Z., Jaakkola, T. and Saul, L. (1999). Introduction to variational methods for graphical models. Machine Learning 37 183-233. · Zbl 1033.68081 · doi:10.1023/A:1020281327116
[30] Kass, R. E. and Raftery, A. E. (1995). Bayes factors. J. Amer. Statist. Assoc. 90 773-795. · Zbl 0846.62028 · doi:10.2307/2291091
[31] Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34 1-14. · Zbl 0850.62756 · doi:10.2307/1269547
[32] Lazarsfeld, P. F. and Henry, N. W. (1968). Latent Structure Analysis. Houghton Mifflin, Boston. · Zbl 0182.52201
[33] Leroux, B. G. (1992). Consistent estimation of a mixing distribution. Ann. Statist. 20 1350-1360. · Zbl 0763.62015 · doi:10.1214/aos/1176348772
[34] Lynch, S. M., Brown, S. J. and Harmsen, K. G. (2003). The effect of altering ADL thresholds on active life expectancy estimates for older persons. J. Gerontology: Social Sciences 58 S171-S178.
[35] Manton, K. G. (1988). A longitudinal study of functional change and mortality in the United States. J. Gerontology: Social Sciences 43 153-161.
[36] Manton, K. G., Corder, L. and Stallard, E. (1993). Estimates of change in chronic disability and institutional incidence and prevalence rate in the U.S. elderly populations from 1982 to 1989. J. Gerontology: Social Sciences 48 S153-S166.
[37] Manton, K. G., Corder, L. and Stallard, E. (1997). Chronic disability trends in elderly United States populations: 1982-1994. Proc. Natl. Acad. Sci. 94 2593-2598.
[38] Manton, K. G. and Gu, X. (2001). Changes in the prevalence of chronic disability in the United States black and nonblack population above age 65 from 1982 to 1999. Proc. Natl. Acad. Sci. 98 6354-6359.
[39] Manton, K. G., Gu, X., Huang, H. and Kovtun, M. (2004). Fuzzy set analyses of genetic determinants of health and disability status. Statistical Methods in Medical Research 13 395-408. · Zbl 1053.62123 · doi:10.1191/0962280204sm374ra
[40] Manton, K. G., Gu, X. and Lamb, V. L. (2006a). Long-term trends in life expectancy and active life expectancy in the United States. Population and Development Review 32 81-105.
[41] Manton, K. G., Gu, X. and Lamb, V. L. (2006b). Change in chronic disability from 1982 to 2004/2005 as measured by long-term changes in function and health in the U.S. elderly population. Proc. Natl. Acad. Sci. 103 18374-18379.
[42] Manton, K. G., Woodbury, M. A., Anker, M. and Jablensky, A. (1994). Symptom profiles of psychiatric disorders based on graded disease classes: An illustration using data from the WHO International Pilot Study of Schizophrenia. Psychological Medicine 24 133-144.
[43] Manton, K. G., Woodbury, M. A. and Tolley, H. D. (1994). Statistical Applications Using Fuzzy Sets. Wiley, New York. · Zbl 0811.62003
[44] McLachlan, G. J. and Peel, D. (2000). Finite Mixture Models. Wiley, New York. · Zbl 0963.62061
[45] Muthen, B. and Shedden, K. (1999). Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics 55 463-469. · Zbl 1059.62599 · doi:10.1111/j.0006-341X.1999.00463.x
[46] Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine Series 5 50 157-175. · JFM 31.0238.04
[47] Pelleg, D. and Moore, A. W. (2000). X-means: Extending \(K\)-means with efficient estimation of the number of clusters. International Conference on Machine Learning 17 727-734.
[48] Potthoff, R. G., Manton, K. G., Woodbury, M. A. and Tolley, H. D. (2000). Dirichlet generalizations of latent-class models. J. Classification 17 315-353. · Zbl 1017.62127 · doi:10.1007/s003570000024
[49] Pritchard, J. K., Stephens, M. and Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics 155 945-959. · Zbl 1083.62537
[50] Raftery, A. E., Newton, M. A., Satagopan, J. M. and Krivitsky, P. N. (2007). Estimating the integrated likelihood via posterior simulation using the harmonic mean identity. In Bayesian Statistics 8 (J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.). Oxford Univ. Press. · Zbl 1252.62038
[51] Rasch, G. (1960). Probabilistic Models for Some Intelligence and Attainment Tests. Nielsen and Lydiche, Copenhagen; expanded English edition (1980), Univ. Chicago Press, Chicago.
[52] Reboussin, B. A., Reboussin, D. M., Liang, K. Y. and Anthony, J. C. (1998). Latent transition modelling of progression of health-risk behavior. Multivariate Behavioral Research 33 457-478.
[53] Roeder, K., Lynch, K. G. and Nagin, D. S. (1999). Modeling uncertainty in latent class membership: A case study in criminology. J. Amer. Statist. Assoc. 94 766-776.
[54] Roeder, K. and Wasserman, L. (1997). Practical density estimation using mixtures of normals. J. Amer. Statist. Assoc. 92 894-902. · Zbl 0889.62021 · doi:10.2307/2965553
[55] Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461-464. · Zbl 0379.62005 · doi:10.1214/aos/1176344136
[56] Singer, B. (1989). Grade of membership representations: Concepts and problems. In Probability, Statistics, and Mathematics: Papers in Honor of Samuel Karlin 317-334. Academic Press, Boston, MA. · Zbl 0683.62065
[57] Singer, B. H. and Manton, K. G. (1998). The effects of health changes on projections of health service needs for the elderly population of the United States. Proc. Natl. Acad. Sci. 95 15618-15622.
[58] Spector, W. D. and Fleishman, J. A. (1998). Combining activities of daily living with instrumental activities of daily living to measure functional disability. J. Gerontology: Social Sciences 53 S46-S57.
[59] Spiegelhalter, D., Thomas, A., Best, N. and Gilks, W. (1996). BUGS 0.5: Bayesian inference using Gibbs sampling manual (version ii). Technical report, MRC Cambridge, UK.
[60] Spiegelhalter, D. J., Best, N. G., Carlin, B. P. and Van der Linde, A. (2002). Bayesian measures of model complexity and fit. J. Roy. Statist. Soc. Ser. B 64 583-639. · Zbl 1067.62010 · doi:10.1111/1467-9868.00353
[61] Stephens, M. (2000). Dealing with label switching in mixture models. J. Roy. Statist. Soc. Ser. B 62 795-809. · Zbl 0957.62020 · doi:10.1111/1467-9868.00265
[62] Tanner, M. A. (1996). Tools for Statistical Inference. Methods for the Exploration of Posterior Distributions and Likelihood Functions, 3rd ed. Springer, New York. · Zbl 0846.62001
[63] Wachter, K. W. (1999). Grade of membership models in low dimensions. Statistical Papers 40 439-457. · Zbl 0938.62066 · doi:10.1007/BF02934635
[64] Woodbury, M. A., Clive, J. and Garson, A. (1978). Mathematical typology: A grade of membership technique for obtaining disease definition. Computers and Biomedical Research 11 277-298.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.