Correlation curves: Measures of association as functions of covariate values.

*(English)* Zbl 0817.62025

Summary: For experiments where the strength of association between a response variable \(Y\) and a covariate \(X\) varies across regions of values of \(X\), we propose local nonparametric dependence functions which measure the strength of association between \(Y\) and \(X\) as a function of \(X = x\). Our dependence functions are extensions of Galton’s idea of strength of co-relation from the bivariate normal case to the nonparametric case.

In particular, a dependence function is obtained by expressing the usual Galton-Pearson correlation coefficient in terms of the regression line slope \(\beta\) and the residual variance \(\sigma^2\) and then replacing \(\beta\) and \(\sigma^2\) by a nonparametric regression slope \(\beta(x)\) and a nonparametric residual variance \(\sigma^2(x) = \text{var}(Y | x)\), respectively. Our local dependence functions are standardized nonparametric regression curves which provide universal scale-free measures of the strength of the relationship between variables in nonlinear models. They share most of the properties of the correlation coefficient and they reduce to the usual correlation coefficient in the bivariate normal case. For this reason we call them correlation curves. We show that, in a certain sense, they quantify E. L. Lehmann’s [Ann. Math. Stat. 37, 1137-1153 (1966; Zbl 0146.406)] notion of regression dependence. Finally, the correlation curve concept is illustrated using data from a study of the relationship between cholesterol level \(x\) and triglyceride concentrations \(y\) of heart patients.
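
The construction described in the summary can be sketched as a short derivation. The summary does not state the formula explicitly, so the following is a reconstruction under stated assumptions: a linear model \(Y = \alpha + \beta X + \varepsilon\) with \(\varepsilon\) uncorrelated with \(X\), and the symbol \(\sigma_1^2 = \text{var}(X)\), which is notation introduced here for illustration.

\[
\text{cov}(X, Y) = \beta \sigma_1^2, \qquad \text{var}(Y) = \beta^2 \sigma_1^2 + \sigma^2
\quad\Longrightarrow\quad
\rho = \frac{\text{cov}(X, Y)}{\sqrt{\text{var}(X)\,\text{var}(Y)}} = \frac{\sigma_1 \beta}{\sqrt{\sigma_1^2 \beta^2 + \sigma^2}} .
\]

Replacing \(\beta\) by \(\beta(x)\) and \(\sigma^2\) by \(\sigma^2(x)\) then suggests a local version of the form

\[
\rho(x) = \frac{\sigma_1 \beta(x)}{\sqrt{\sigma_1^2 \beta(x)^2 + \sigma^2(x)}} .
\]

Under this form, \(\rho(x)\) lies strictly between \(-1\) and \(1\) whenever \(\sigma^2(x) > 0\), and in the bivariate normal case, where \(\beta(x) = \beta\) and \(\sigma^2(x) = \sigma^2\) are constant, it reduces to \(\rho\) for every \(x\), consistent with the properties stated in the summary.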


##### MSC:

62G07 | Density estimation |

62H20 | Measures of association (correlation, canonical correlation, etc.) |

62J02 | General nonlinear regression |