
A flexible multivariate model for high-dimensional correlated count data. (English) Zbl 1465.62089

Summary: We propose a flexible multivariate stochastic model for over-dispersed count data. Our methodology is built upon mixed Poisson random vectors \((Y_1,\dots,Y_d)\), where the \(\{Y_i\}\) are Poisson random variables that are conditionally independent given their stochastic rates. These rates follow a multivariate distribution with arbitrary non-negative margins linked by a copula function. We present basic properties of these mixed Poisson multivariate distributions and provide several examples. A particular case with geometric and negative binomial marginal distributions is studied in detail. We illustrate an application of the model with a high-dimensional simulation motivated by RNA-sequencing data.
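The construction described in the summary can be sketched for the bivariate case. This is only an illustration under stated assumptions, not the authors' implementation: the Gaussian copula, the exponential margins for the rates, the function names `sample_poisson` and `sample_mixed_poisson_pair`, and all parameter values are choices made here for the example. Exponential rates are used because a Poisson variable mixed over an exponential rate has a geometric marginal distribution, one of the cases the summary mentions.

```python
import math
import random
from statistics import NormalDist, mean, pvariance


def sample_poisson(rate, rng):
    """Draw one Poisson variate by Knuth's multiplication method.

    Adequate for the moderate rates in this sketch; not meant for large rates.
    """
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_mixed_poisson_pair(mean1, mean2, rho, rng):
    """One draw (Y1, Y2): exponential rates linked by a Gaussian copula (assumed)."""
    # Correlated standard normals from the 2x2 Cholesky factor of [[1, rho], [rho, 1]].
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # The normal CDF maps them to uniforms that carry the copula's dependence.
    std_normal = NormalDist()
    u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)  # clamp to keep the log finite
    u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
    # Inverse exponential CDF: non-negative rates with the requested means.
    lam1 = -mean1 * math.log1p(-u1)
    lam2 = -mean2 * math.log1p(-u2)
    # Conditionally on the rates, the two counts are independent Poisson draws.
    return sample_poisson(lam1, rng), sample_poisson(lam2, rng)


rng = random.Random(0)
draws = [sample_mixed_poisson_pair(2.0, 3.0, 0.8, rng) for _ in range(20000)]
y1 = [a for a, _ in draws]
y2 = [b for _, b in draws]
print("mean of Y1:", mean(y1), "variance of Y1:", pvariance(y1))
print("mean of Y2:", mean(y2), "variance of Y2:", pvariance(y2))
```

In this sketch the sample variance of each count should exceed its sample mean, which is the over-dispersion the model targets, and the positive copula parameter `rho` induces positive dependence between the two counts. Other non-negative margins (for example gamma rates, which give negative binomial counts) or other copulas could be substituted at the inverse-CDF step.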

MSC:

62H05 Characterization and structure theory for multivariate probability distributions; copulas
62H10 Multivariate distribution of statistics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

GenOrd
