Calibration tests for multivariate Gaussian forecasts. (English) Zbl 1352.62087

Summary: Forecasts by nature should take the form of probability distributions. Calibration, the statistical consistency of forecast distributions and observations, is a central property of good probabilistic forecasts. Calibration of univariate forecasts has been widely discussed, and significance tests are commonly used to investigate whether a prediction model is miscalibrated. However, calibration tests for multivariate forecasts are rare. In this paper, we propose calibration tests for multivariate Gaussian forecasts based on two types of the Dawid-Sebastiani score (DSS): the multivariate DSS (mDSS) and the individual DSS (iDSS). Analytic results and simulation studies show that the tests have sufficient power to detect miscalibrated forecasts with an incorrect mean or an incorrect variance. For forecasts with incorrect correlation coefficients, however, only the tests based on the mDSS are sensitive to miscalibration. As an illustration, we apply the methodology to weekly data on Norovirus disease incidence among males and females in Germany from 2011 to 2014. The results further show that tests for multivariate forecasts are useful tools and are superior to univariate calibration tests for correlated multivariate forecasts.
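
Illustration (not taken from the paper): the sketch below shows one way a score-based calibration check for multivariate Gaussian forecasts could look. It uses the usual form of the multivariate Dawid-Sebastiani score, log|Σ| + (y − μ)ᵀ Σ⁻¹ (y − μ), together with the fact that under a calibrated Gaussian forecast the quadratic term is chi-squared with d degrees of freedom (mean d, variance 2d). The function names `mdss` and `mdss_z_test`, the assumption of independent forecast cases, and the normal approximation are assumptions made for this example; the tests proposed in the paper may differ in detail, and the iDSS variant is not covered here.

```python
import math
import numpy as np


def mdss(y, mu, sigma):
    """Multivariate Dawid-Sebastiani score: log|Sigma| + (y - mu)' Sigma^{-1} (y - mu)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        raise ValueError("sigma must be positive definite")
    resid = y - mu
    # Solve a linear system instead of forming the explicit inverse.
    return logdet + resid @ np.linalg.solve(sigma, resid)


def mdss_z_test(ys, mus, sigmas):
    """Hypothetical z-test: compare observed scores with their null expectation.

    Assumes n independent d-dimensional forecast cases. Under calibration,
    each score has mean log|Sigma_i| + d and variance 2d.
    Returns the z statistic and a two-sided normal p-value.
    """
    ys = np.asarray(ys, dtype=float)
    n, d = ys.shape
    scores = np.array([mdss(y, m, s) for y, m, s in zip(ys, mus, sigmas)])
    expected = np.array([np.linalg.slogdet(np.asarray(s, dtype=float))[1] + d
                         for s in sigmas])
    z = (scores - expected).sum() / math.sqrt(2.0 * d * n)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Simulated example: the forecast ignores a true correlation of 0.8.
    rng = np.random.default_rng(1)
    true_sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
    obs = rng.multivariate_normal(np.zeros(2), true_sigma, size=200)
    forecast_mus = [np.zeros(2)] * 200
    forecast_sigmas = [np.eye(2)] * 200
    print(mdss_z_test(obs, forecast_mus, forecast_sigmas))
```

In the simulated example the forecast has the correct mean and marginal variances but the wrong correlation, so the expected value of the score is unchanged and only its spread differs. A check on the mean score alone is therefore only one possible construction and should not be read as the paper's test.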

MSC:

62H15 Hypothesis testing in multivariate analysis
62E20 Asymptotic distribution theory in statistics
62P10 Applications of statistics to biology and medical sciences; meta analysis
62J05 Linear regression; mixed models
