A Bayesian model for repeated measures zero-inflated count data with application to outpatient psychiatric service use.

*(English)* Zbl 07256832

Summary: In applications involving count data, it is common to encounter an excess number of zeros. For example, in the study of outpatient service utilization, the number of utilization days will take on integer values, with many subjects having no utilization (zero values). Mixed distribution models, such as the zero-inflated Poisson and zero-inflated negative binomial, are often used to fit such data. A more general class of mixture models, called hurdle models, can be used to model zero deflation as well as zero inflation. Several authors have proposed frequentist approaches to fitting zero-inflated models for repeated measures. We describe a practical Bayesian approach which incorporates prior information, has optimal small-sample properties and allows for tractable inference. The approach can be easily implemented using standard Bayesian software. A study of psychiatric outpatient service use illustrates the methods.
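
For orientation, the two model classes named in the summary are commonly written in the Poisson case as sketched below. The symbols $\pi$, $p$ and $\lambda$ are notation assumed here for illustration and are not taken from the paper, whose own parameterization (including covariates and random effects for repeated measures) may differ.

```latex
% Requires amsmath for the cases environment.
% Zero-inflated Poisson (ZIP): a point mass at zero mixed with a Poisson(lambda) count.
\[
P(Y = y) =
\begin{cases}
\pi + (1-\pi)\,e^{-\lambda}, & y = 0,\\[4pt]
(1-\pi)\,\dfrac{e^{-\lambda}\lambda^{y}}{y!}, & y = 1, 2, \dots
\end{cases}
\qquad 0 \le \pi \le 1 .
\]
% Poisson hurdle: a Bernoulli part for zero vs. positive, then a zero-truncated Poisson.
\[
P(Y = y) =
\begin{cases}
1 - p, & y = 0,\\[4pt]
p\,\dfrac{e^{-\lambda}\lambda^{y}}{y!\,\bigl(1 - e^{-\lambda}\bigr)}, & y = 1, 2, \dots
\end{cases}
\qquad 0 \le p \le 1 .
\]
```

In the ZIP form the zero probability can never fall below the Poisson value $e^{-\lambda}$, so only zero inflation is representable. In the hurdle form, $1 - p$ may lie above or below $e^{-\lambda}$, which is consistent with the summary's remark that hurdle models accommodate zero deflation as well as zero inflation.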

##### MSC:

62 | Statistics |

##### Keywords:

Bayesian inference; hurdle model; repeated measures; zero-altered model; zero-inflated model

*B. H. Neelon* et al., Stat. Model. 10, No. 4, 421–439 (2010; Zbl 07256832)

##### References:

[1] | Berger J and Pericchi L (1996) The intrinsic Bayes factor for model selection and prediction . Journal of the American Statistical Association, 91, 109-22 . · Zbl 0870.62021 |

[2] | Brooks SP and Gelman A (1998) General methods for monitoring convergence of iterative simulations . Journal of Computational and Graphical Statistics, 7, 434-55 . |

[3] | Browne WJ and Draper D (2000) Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models . Computational Statistics, 15, 391-420 . · Zbl 1037.62013 |

[4] | Celeux G , Forbes F , Robert CP and Titterington DM (2006) Deviance information criteria for missing data models . Bayesian Analysis 1, 651-74 . · Zbl 1331.62329 |

[5] | Congdon P (2005) Bayesian models for categorical data. Chichester: John Wiley & Sons . · Zbl 1079.62036 |

[6] | Cooper NJ , Sutton AJ , Mugford M and Abrams KR (2003) Use of Bayesian Markov chain Monte Carlo methods to model cost-of-illness data . Medical Decision Making, 23, 38-53 . |

[7] | Cooper NJ , Lambert PC , Abrams KR and Sutton AJ (2007) Predicting costs over time using Bayesian Markov chain Monte Carlo methods: an application to early inflammatory polyarthritis . Health Economics, 16, 37-56 . |

[8] | Dalrymple ML , Hudson IL and Ford RPK (2003) Finite mixture, zero-inflated Poisson and hurdle models with application to SIDS . Computational Statistics and Data Analysis, 41, 491-504 . · Zbl 1429.62513 |

[9] | Dey DK , Chen M-H and Chang H (1997) Bayesian approach for nonlinear random effect models . Biometrics, 53, 1239-52 . · Zbl 0911.62024 |

[10] | Duan N (1983) Smearing estimate: a nonparametric retransformation method . Journal of the American Statistical Association, 78, 605-10 . · Zbl 0534.62021 |

[11] | Elliott MR , Gallo JJ , Ten Have TR , Bogner HR , et al. (2005) Using a Bayesian latent growth curve model to identify trajectories of positive affect and negative events following myocardial infarction . Biostatistics, 6, 119-43 . · Zbl 1069.62095 |

[12] | Fahrmeir L and Osuna Echavarría L (2006) Structured additive regression for overdispersed and zero-inflated count data . Applied Stochastic Models in Business and Industry, 22, 351-69 . · Zbl 1114.62023 |

[13] | Gamerman D (1997) Efficient sampling from the posterior distribution in generalized linear models . Statistics and Computing, 7, 57-68 . |

[14] | Geisser S and Eddy W (1979) A predictive approach to model selection . Journal of the American Statistical Association, 74, 153-60 . · Zbl 0401.62036 |

[15] | Gelfand AE (1996) Model determination using sampling-based methods. In Gilks WR , Richardson S and Spiegelhalter DJ (eds), Markov chain Monte Carlo in practice. London: Chapman & Hall , 145-60. · Zbl 0840.62003 |

[16] | Gelfand AE and Dey D (1994) Bayesian model choice: asymptotics and exact calculations . Journal of the Royal Statistical Society, Series B, 56, 501-14 . · Zbl 0800.62170 |

[17] | Gelman A (2006) Prior distributions for variance parameters in hierarchical models . Bayesian Analysis, 1, 515-33 . · Zbl 1331.62139 |

[18] | Gelman A , Meng XL and Stern H (1996) Posterior predictive assessment of model fitness via realized discrepancies . Statistica Sinica, 6, 733-807 . · Zbl 0859.62028 |

[19] | Gelman A , Carlin JB , Stern HS and Rubin DB (2004) Bayesian data analysis (second edition). Boca Raton: Chapman & Hall . · Zbl 1039.62018 |

[20] | Geweke J (1992) Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bernardo J , Berger JO , Dawid AP and Smith AFM (eds), Bayesian Statistics 4. Oxford: Oxford University Press , 169-94. |

[21] | Ghosh SK , Mukhopadhyay P and Lu JC (2006) Bayesian analysis of zero-inflated regression models . Journal of Statistical Planning and Inference, 136, 1360-75 . · Zbl 1088.62139 |

[22] | Gilks WR and Wild P (1992) Adaptive rejection sampling for Gibbs sampling . Applied Statistics, 41, 337-48 . · Zbl 0825.62407 |

[23] | Gilks WR , Wang CC , Yvonnet B and Coursaget P (1993) Random-effects models for longitudinal data using Gibbs sampling . Biometrics, 49, 441-53 . · Zbl 0783.62092 |

[24] | Gilks WR , Best NG and Tan KKC (1995) Adaptive rejection Metropolis sampling within Gibbs sampling . Applied Statistics, 44, 455-72 . · Zbl 0893.62110 |

[25] | Green P (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination . Biometrika, 82, 711-32 . · Zbl 0861.62023 |

[26] | Greene W (1994) Accounting for excess zeros and sample selection in Poisson and negative binomial regression models. Working Paper EC-94-10, Department of Economics, New York University . |

[27] | Gschlößl S and Czado C (2008) Modelling count data with overdispersion and spatial effects . Statistical Papers, 49, 531-52 . · Zbl 1310.62083 |

[28] | Hall D (2000) Zero-inflated Poisson and binomial regression with random effects: a case study . Biometrics, 56, 1030-39 . · Zbl 1060.62535 |

[29] | Heilbron DC (1989) Generalized linear models for altered zero probabilities and overdispersion in count data. SIMS Technical Report 9, Department of Epidemiology and Biostatistics, University of California, San Francisco . |

[30] | Heilbron DC (1994) Zero-altered and other regression models for count data with added zeros . Biometrical Journal, 36, 531-47 . · Zbl 0846.62053 |

[31] | Hoeting JA , Madigan D , Raftery AE and Volinsky CT (1999) Bayesian model averaging: a tutorial . Statistical Science, 14, 382-417 . · Zbl 1059.62525 |

[32] | Kass RE and Raftery AE (1995) Bayes factors . Journal of the American Statistical Association, 90, 773-95 . · Zbl 0846.62028 |

[33] | Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing . Technometrics, 34, 1-14 . · Zbl 0850.62756 |

[34] | Lambert PC , Sutton AJ , Burton PR , Abrams KR , et al. (2005) How vague is vague? A simulation study of the impact of the use of vague prior distributions in MCMC using WinBUGS . Statistics in Medicine, 24, 2401-28 . |

[35] | Liu H and Powers DA (2007) Growth curve models for zero-inflated count data: an application to smoking behavior . Structural Equation Modeling, 14, 247-79 . |

[36] | Min Y and Agresti A (2005) Random effect models for repeated measures of zero-inflated count data . Statistical Modelling, 5, 1-19 . · Zbl 1070.62060 |

[37] | Mullahy J (1986) Specification and testing of some modified count data models . Journal of Econometrics, 33, 341-65 . |

[38] | Muthén B , Brown CH , Masyn K , Jo B , Khoo ST , et al. (2002) General growth mixture modeling for randomized preventive interventions . Biostatistics, 3, 459-75 . · Zbl 1138.62365 |

[39] | Olsen MK and Schafer JL (2001) A two-part random-effects model for semicontinuous longitudinal data . Journal of the American Statistical Association, 96, 730-45 . · Zbl 1017.62064 |

[40] | R Development Core Team (2007) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing . http://www.R-project.org. |

[41] | Ridout M , Demétrio CGB and Hinde J (1998) Models for count data with many zeros . Proceedings from the International Biometric Conference, Cape Town, December 1998. |

[42] | Rodrigues J (2003) Bayesian analysis of zero-inflated distributions . Communications in Statistics, Theory and Methods, 32, 281-89 . · Zbl 1024.62009 |

[43] | Roeder K , Lynch KG and Nagin DS (1999) Modeling uncertainty in latent class membership: a case study in criminology . Journal of the American Statistical Association, 94, 766-76 . |

[44] | Rose CE , Martin SW , Wannemuehler KA and Plikaytis BD (2006) On the use of zero-inflated and hurdle models for modeling vaccine adverse event count data . Journal of Biopharmaceutical Statistics, 16, 463-81 . |

[45] | Rosenheck RA , Lam JA , Morrissey JP , Calloway MO , et al. (2002) Service systems integration and outcomes for mentally ill homeless persons in the ACCESS program . Psychiatric Services, 53, 958-66 . |

[46] | Smith AFM and Gelfand AE (1992) Bayesian statistics without tears: a sampling-resampling perspective . American Statistician, 46, 84-88 . |

[47] | Smith BJ (2007) Boa: an R package for MCMC output convergence assessment and posterior inference . Journal of Statistical Software, 21, 1-37 . |

[48] | Spiegelhalter DJ (1998) Bayesian graphical modelling: a case study in monitoring health outcomes . Applied Statistics, 47, 115-33 . |

[49] | Spiegelhalter DJ , Best NG , Carlin BP and van der Linde A (2002) Bayesian measures of model complexity and fit . Journal of the Royal Statistical Society: Series B, 64, 583-639 . · Zbl 1067.62010 |

[50] | Spiegelhalter DJ , Thomas A , Best N and Lunn D (2003) WinBUGS Version 1.4: User Manual. Cambridge: Medical Research Council Biostatistics Unit . http://www.mrc-bsu.cam.ac.uk/bugs. |

[51] | Tooze JA , Grunwald GK and Jones RH (2002) Analysis of repeated measures data with clumping at zero . Statistical Methods in Medical Research, 11, 341-55 . · Zbl 1121.62674 |

[52] | Yau KK and Lee AH (2001) Zero-inflated Poisson regression with random effects to evaluate an occupational injury prevention programme . Statistics in Medicine, 20, 2907-20 . |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.