Correlation when data are missing.

*(English)* Zbl 1198.62174 Summary: Variable correlation is important for many operations research models. Many inventory, revenue management, and queuing models presume uncorrelated demand between products, market segments, or time periods. The specific model applied, or the resulting policies of a model, can differ drastically depending on variable correlations. Missing data are a common problem for real-world applications of operations research models.

This work is at the junction of the two topics of correlation and missing data. We propose a test of independence between two variables when data are missing. The typical method for determining correlation with missing data ignores all data pairs in which one point is missing. The test presented here incorporates all data. The test can be applied when both variables are continuous, when both are discrete, or when one variable is discrete and the other is continuous. The test makes no assumptions about the distribution of the two variables, and thus it can be used to extend applications of nonparametric rank tests, such as Spearman’s rank correlation, to the case where data are missing. An example is shown where failure to incorporate the incomplete data yields incorrect policies.
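For orientation, the "typical method" mentioned in the summary (dropping every pair in which one value is missing) can be sketched as follows. This is only an illustration of that conventional baseline applied to Spearman's rank correlation, not the test proposed in the paper, whose details the summary does not give. The function names, the use of `None` as the missing-value marker, and the sample data are all assumptions made for the example.

```python
import math
from typing import List, Optional, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def spearman_complete_pairs(
    x: Sequence[Optional[float]], y: Sequence[Optional[float]]
) -> Tuple[float, int]:
    """Spearman's rho using only pairs where both values are observed.

    Hypothetical helper: None marks a missing value. Returns (rho, pairs used).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 2:
        return float("nan"), len(pairs)
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    return pearson(average_ranks(xs), average_ranks(ys)), len(pairs)


# Made-up data: two of the six pairs have one missing value and are discarded.
x = [1.0, 2.0, None, 4.0, 5.0, 6.0]
y = [2.0, None, 3.0, 8.0, 10.0, 12.0]
rho, n_used = spearman_complete_pairs(x, y)
```

In this made-up sample only four of the six pairs contribute to the estimate. The observed values in the two incomplete pairs are thrown away, which is the information the summary says the proposed test incorporates.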
