Item selection by latent class-based methods: an application to nursing home evaluation. (English) Zbl 1414.62508

Summary: The evaluation of nursing homes is usually based on questionnaires, consisting of a large number of polytomous items, administered to their patients. In such a context, the latent class model is a useful tool for clustering subjects into homogeneous groups corresponding to different degrees of impairment of their health conditions. It is known that the performance of model-based clustering and the accuracy of the choice of the number of latent classes may be affected by the presence of irrelevant or noise variables. In this paper, we show the application of an item selection algorithm to a dataset collected within a project, named ULISSE, on the quality of life of elderly patients hosted in Italian nursing homes. This algorithm, which is closely related to that proposed by Dean and Raftery in 2010, aims to find the subset of items that provides the best clustering according to the Bayesian Information Criterion. At the same time, it allows us to select the optimal number of latent classes. Given the complexity of the ULISSE study, we validate the results by means of a sensitivity analysis, with respect to different specifications of the initial subset of items, and by means of a resampling procedure.
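
For orientation, the quantities behind this selection criterion can be sketched in standard latent class notation. The symbols \(n\) (sample size), \(J\) (number of items), \(k\) (number of latent classes), \(m_j\) (number of response categories of item \(j\)), \(\pi_u\), \(\phi_{jy\mid u}\), \(\hat{\ell}_k\) (maximized log-likelihood) and the sign convention for BIC are assumptions made here for illustration; they are not taken from the paper.

\[
p(y_1,\dots,y_J) = \sum_{u=1}^{k} \pi_u \prod_{j=1}^{J} \phi_{j y_j \mid u},
\qquad
\#\mathrm{par}(k) = (k-1) + k \sum_{j=1}^{J} (m_j - 1),
\]
\[
\mathrm{BIC}(k) = -2\,\hat{\ell}_k + \#\mathrm{par}(k)\,\log n .
\]

Under this convention a smaller BIC is preferred; part of the model-based clustering literature uses the opposite sign, \(2\,\hat{\ell}_k - \#\mathrm{par}(k)\log n\), and maximizes it. In either case, a search of the kind described in the summary would compare BIC values across candidate item subsets and across values of \(k\).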


62P25 Applications of statistics to social sciences
62H30 Classification and discrimination; cluster analysis (statistical aspects)




[1] Bacci, S.; Bartolucci, F.; Gnaldi, M., A class of multidimensional latent class IRT models for ordinal polytomous item responses, Commun Stat Theory Methods, 43, 787-800, (2014) · Zbl 1462.62400
[2] Bandeen-Roche, K.; Miglioretti, DL; Zeger, SL; Rathouz, PJ, Latent variable regression for multiple discrete outcomes, J Am Stat Assoc, 92, 1375-1386, (1997) · Zbl 0912.62121
[3] Bandeen-Roche, K.; Xue, QL; Ferrucci, L.; Walston, J.; Guralnik, JM; Chaves, P.; Zeger, SL; Fried, LP, Phenotype of frailty: characterization in the women’s health and aging studies, J Gerontol Ser A Biol Sci Med Sci, 61, 262-266, (2006)
[4] Biernacki, C.; Celeux, G.; Govaert, G., Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models, Comput Stat Data Anal, 41, 561-575, (2003) · Zbl 1429.62235
[5] Breyer, F.; Costa-Font, J.; Felder, S., Ageing, health, and health care, Oxf Rev Econ Policy, 26, 674-690, (2010)
[6] Dean, N.; Raftery, AE, Latent class analysis variable selection, Ann Inst Stat Math, 62, 11-35, (2010) · Zbl 1422.62085
[7] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, J R Stat Soc Ser B, 39, 1-38, (1977) · Zbl 0364.62022
[8] Fraley, C.; Raftery, AE, Model-based clustering, discriminant analysis, and density estimation, J Am Stat Assoc, 97, 611-631, (2002) · Zbl 1073.62545
[9] Galasso, V.; Profeta, P., How does ageing affect the welfare state?, Eur J Polit Econ, 23, 554-563, (2007)
[10] Goodman, LA, Exploratory latent structure analysis using both identifiable and unidentifiable models, Biometrika, 61, 215-231, (1974) · Zbl 0281.62057
[11] Harel, O.; Schafer, JL, Partial and latent ignorability in missing-data problems, Biometrika, 96, 37-50, (2009) · Zbl 1162.62095
[12] Hawes, CH; Morris, JN; Phillips, CD; Fries, BE; Murphy, K.; Mor, V., Development of the nursing home resident assessment instrument in the USA, Age Ageing, 26, 19-25, (1997)
[13] Karlis, D.; Xekalaki, E., Choosing initial values for the EM algorithm for finite mixtures, Comput Stat Data Anal, 41, 577-590, (2003) · Zbl 1429.62082
[14] Kass, R.; Raftery, AE, Bayes factors, J Am Stat Assoc, 90, 773-795, (1995) · Zbl 0846.62028
[15] Kohler, HP; Billari, FC; Ortega, JA, The emergence of lowest-low fertility in Europe during the 1990s, Popul Dev Rev, 28, 641-680, (2002)
[16] Lafortune, L.; Beland, F.; Bergman, H.; Ankri, J., Health status transitions in community-living elderly with complex care needs: a latent class approach, BMC Geriatr, 9, 6, (2009)
[17] Lattanzio, F.; Mussi, C.; Scafato, E.; Ruggiero, C.; Dell’Aquila, G.; Pedone, C.; Mammarella, F.; Galluzzo, L.; Salvioli, G.; Senin, U.; Carbonin, PU; Bernabei, R.; Cherubini, A., Health care for older people in Italy: The U.L.I.S.S.E. project (un link informatico sui servizi sanitari esistenti per l’anziano—a computerized network on health care services for older people), J Nutr Health Aging, 14, 238-242, (2010)
[18] Lazarsfeld, PF, The logical and mathematical foundation of latent structure analysis, in: Stouffer, SA; Suchman, EA; Guttman, L. (eds.), Measurement and prediction, Princeton University Press, New York, (1950)
[19] Lazarsfeld, PF; Henry, NW, Latent structure analysis, Houghton Mifflin, Boston, (1968) · Zbl 0182.52201
[20] Little, RJA; Rubin, DB, Statistical analysis with missing data, 2nd edn., Wiley Series in Probability and Statistics, Wiley, (2002)
[21] Lu, G.; Copas, JB, Missing at random, likelihood ignorability and model completeness, Ann Stat, 32, 754-765, (2004) · Zbl 1048.62007
[22] Magidson, J.; Vermunt, JK, Latent class factor and cluster models, bi-plots and related graphical displays, Sociol Methodol, 31, 223-264, (2001)
[23] Moran, M.; Walsh, C.; Lynch, A.; Coen, RF; Coakley, D.; Lawlor, BA, Syndromes of behavioural and psychological symptoms in mild Alzheimer’s disease, Int J Geriatr Psychiatry, 19, 359-364, (2004)
[24] Morris, J.; Hawes, C.; Murphy, K. et al., Resident assessment instrument training manual and resource guide, Eliot Press, Natick, (1991)
[25] Rubin, DB, Inference and missing data, Biometrika, 63, 581-592, (1976) · Zbl 0344.62034
[26] Samejima, F., Estimation of latent ability using a response pattern of graded scores, Psychometrika: monograph supplement, 17, Psychometric Society, Richmond, i-169, (1969)
[27] Samejima, F., Evaluation of mathematical models for ordered polychotomous responses, Behaviormetrika, 23, 17-35, (1996)
[28] Schwarz, G., Estimating the dimension of a model, Ann Stat, 6, 461-464, (1978) · Zbl 0379.62005
[29] Vermunt, JK; Magidson, J., Latent class cluster analysis, in: Hagenaars, JA; McCutcheon, AL (eds.), Applied latent class analysis, Cambridge University Press, Cambridge, (2002)