Correlation curves as local measures of variance explained by regression.

*(English)* Zbl 0803.62035

Summary: We call (a model for) an experiment heterocorrelatious if the strength of the relationship between a response variable \(Y\) and a covariate \(X\) is different in different regions of the covariate space. For such experiments we introduce a correlation curve that measures heterocorrelaticity in terms of the variance explained by regression locally at each covariate value.

More precisely, the squared correlation curve is obtained in two steps: first, the usual linear model “variance explained to total variance” formula is expressed in terms of the residual variance and the regression slope; then these are replaced by the conditional residual variance depending on \(x\) and the slope of the conditional mean of \(Y\) given \(X=x\). The correlation curve \(\rho(x)\) satisfies the invariance properties of correlation, reduces to the Galton-Pearson correlation \(\rho\) in linear models, lies between \(-1\) and 1, is 0 when \(X\) and \(Y\) are independent, and is \(\pm1\) when \(Y\) is a function of \(X\).
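
The construction described above can be written out as follows. The notation is an assumption introduced here for illustration and does not appear in the summary: \(\sigma_X^2\) is the variance of \(X\), \(\beta\) and \(\sigma_\varepsilon^2\) are the slope and residual variance of the linear model \(Y=\alpha+\beta X+\varepsilon\), \(\mu(x)=E(Y\mid X=x)\) is the conditional mean with slope \(\beta(x)=\mu'(x)\), and \(\sigma^2(x)=\operatorname{Var}(Y\mid X=x)\) is the conditional residual variance.

\[
\rho^2=\frac{\beta^2\sigma_X^2}{\beta^2\sigma_X^2+\sigma_\varepsilon^2}
\quad\longrightarrow\quad
\rho(x)=\frac{\sigma_X\,\beta(x)}{\sqrt{\sigma_X^2\,\beta(x)^2+\sigma^2(x)}} .
\]

Written this way, the listed properties can be read off directly: \(|\rho(x)|\le 1\), the sign of \(\rho(x)\) is the sign of \(\beta(x)\), and \(\rho(x)\) is constant and equal to \(\rho\) when \(\beta(x)\equiv\beta\) and \(\sigma^2(x)\equiv\sigma_\varepsilon^2\).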

We introduce estimates of the correlation curve based on nearest-neighbor estimates of the (conditional) residual variance function and the (conditional) regression slope function, as well as on Th. Gasser and H.-G. Müller’s [Scand. J. Stat., Theory Appl. 11, 171-185 (1984; Zbl 0548.62028)] kernel estimates of these functions. We obtain consistency and asymptotic normality results and give simple asymptotic simultaneous confidence intervals for the correlation curve. Real data and simulated data examples are used to illustrate the local correlation procedures.
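
As a rough illustration of the nearest-neighbor idea, the following sketch estimates the correlation curve at a single point by fitting a least-squares line to the \(k\) nearest neighbors of \(x_0\) and plugging the local slope and local residual variance into the ratio of explained to total variance. This is a simplified stand-in under stated assumptions, not the estimator analyzed in the paper: the function name `local_correlation`, the use of an ordinary least-squares line on the neighbors, and the divisor \(k-2\) for the residual variance are all choices made here for illustration.

```python
import numpy as np

def local_correlation(x, y, x0, k):
    """Sketch of a k-nearest-neighbor estimate of the correlation curve at x0.

    Hypothetical helper (not from the paper): fits a straight line to the
    k observations whose x-values are closest to x0, then combines the
    local slope, the local residual variance, and the overall standard
    deviation of x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if k < 3 or k > x.size:
        raise ValueError("k must satisfy 3 <= k <= len(x)")

    # Indices of the k observations nearest to x0 in the covariate.
    idx = np.argsort(np.abs(x - x0))[:k]
    xk, yk = x[idx], y[idx]

    # Local least-squares line: polyfit returns [slope, intercept].
    slope, intercept = np.polyfit(xk, yk, 1)

    # Local residual variance (divisor k - 2 is an assumption of this sketch).
    resid = yk - (slope * xk + intercept)
    res_var = np.sum(resid ** 2) / (k - 2)

    # Standard deviation of the covariate over the whole sample.
    sx = np.std(x, ddof=1)

    denom = np.sqrt(sx ** 2 * slope ** 2 + res_var)
    # Degenerate case: flat local fit with no residual variation.
    return 0.0 if denom == 0 else float(sx * slope / denom)
```

Evaluating `local_correlation` over a grid of `x0` values traces out an estimated curve. With `k` equal to the sample size the fit is global, and the value is close to the ordinary sample correlation, mirroring the property that the correlation curve reduces to the Galton-Pearson correlation in linear models.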

##### MSC:

62G07 | Density estimation |

62H20 | Measures of association (correlation, canonical correlation, etc.) |