
A generalized mixture model applied to diabetes incidence data. (English) Zbl 1369.62310

Summary: We present a generalization of the usual (independent) mixture model to accommodate a first-order Markovian mixing distribution. We propose the data-driven reversible jump, a Markov chain Monte Carlo (MCMC) procedure, for estimating the posterior probability of each candidate model in a model selection procedure and for estimating the corresponding parameters. On simulated datasets, the proposed method shows excellent performance in convergence, model selection, and precision of the parameter estimates. Finally, we apply the proposed method to analyze USA diabetes incidence datasets.
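To make the model class in the summary concrete, the following is a minimal simulation sketch, not the authors' method: component labels follow a first-order Markov chain, and each observation is drawn from the component selected by the current label. The function name `simulate_markov_mixture`, the Gaussian component densities, and all numeric values are assumptions chosen for illustration; the summary does not specify the component family. The usual independent mixture is recovered when every row of the transition matrix equals the mixing weights.

```python
import random


def simulate_markov_mixture(n, init, trans, means, sds, seed=0):
    """Simulate n observations from a mixture with Markov-dependent labels.

    init  : initial label probabilities (length k)
    trans : k x k transition matrix; trans[i][j] = P(next label j | current label i)
    means, sds : Gaussian component parameters (illustrative assumption)
    """
    rng = random.Random(seed)
    k = len(init)
    labels, obs = [], []
    state = rng.choices(range(k), weights=init)[0]
    for _ in range(n):
        labels.append(state)
        obs.append(rng.gauss(means[state], sds[state]))
        # First-order dependence: the next label depends only on the current one.
        state = rng.choices(range(k), weights=trans[state])[0]
    return labels, obs


# Hypothetical two-component settings.
init = [0.5, 0.5]
markov = [[0.9, 0.1], [0.2, 0.8]]        # labels tend to persist
independent = [[0.5, 0.5], [0.5, 0.5]]   # identical rows: ordinary mixture

labels, obs = simulate_markov_mixture(200, init, markov, [0.0, 5.0], [1.0, 1.0])
```

Swapping `markov` for `independent` in the call gives data from the ordinary mixture model, which is the special case that the generalized model contains.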

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
65C40 Numerical analysis or methods applied to Markov chains
92D30 Epidemiology

Software:

depmix; depmixS4
