zbMATH — the first resource for mathematics

A bias bound for least squares linear regression. (English) Zbl 0824.62057
Summary: Consider a general linear model \(y= g(\alpha+ \beta {\mathbf x})+ \varepsilon\), where the link function \(g\) is arbitrary and unknown. The maximal component of \((\alpha, \beta)\) that can be identified is the direction of \(\beta\), which measures the substitutability of the components of \({\mathbf x}\). If \(\zeta (\beta {\mathbf x})= E({\mathbf x}\mid \beta {\mathbf x})\) is linear in \(\beta {\mathbf x}\), the least squares linear regression of \(y\) on \({\mathbf x}\) gives a consistent estimate for the direction of \(\beta\), despite possible nonlinearity in the link function. If \(\zeta (\beta {\mathbf x})\) is nonlinear, the linear regression might be inconsistent for the direction of \(\beta\).
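The consistency claim under linearity of \(\zeta\) can be checked in a few lines. The following is a sketch under assumptions not stated in the summary: \(E(\varepsilon\mid {\mathbf x})=0\), \(\Sigma = \operatorname{Cov}({\mathbf x})\) is nonsingular, \(\mu = E({\mathbf x})\), and \(\beta\) is treated as a row vector. If \(\zeta\) is linear, it must coincide with the best linear predictor of \({\mathbf x}\) from \(\beta {\mathbf x}\), so that
\[
E({\mathbf x}\mid \beta {\mathbf x}) = \mu + \frac{\Sigma \beta^{\top}}{\beta \Sigma \beta^{\top}}\,(\beta {\mathbf x} - \beta \mu),
\qquad
\operatorname{Cov}({\mathbf x}, y) = E\bigl[E({\mathbf x}-\mu \mid \beta {\mathbf x})\, g(\alpha + \beta {\mathbf x})\bigr] = \frac{\Sigma \beta^{\top}}{\beta \Sigma \beta^{\top}}\,\operatorname{Cov}(\beta {\mathbf x}, y).
\]
The population least squares slope is therefore
\[
\Sigma^{-1}\operatorname{Cov}({\mathbf x}, y) = \frac{\operatorname{Cov}(\beta {\mathbf x}, y)}{\beta \Sigma \beta^{\top}}\,\beta^{\top},
\]
which is proportional to \(\beta^{\top}\) whatever the link \(g\) is. The direction is recovered whenever \(\operatorname{Cov}(\beta {\mathbf x}, y) \neq 0\).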
We establish a bound for the asymptotic bias, which is determined by the nonlinearity in \(\zeta( \beta {\mathbf x})\) and the multiple correlation coefficient \(R^2\) for the least squares linear regression of \(y\) on \({\mathbf x}\). According to the bias bound, the linear regression is nearly consistent for the direction of \(\beta\), despite possible nonlinearity in the link function, provided that the nonlinearity in \(\zeta (\beta {\mathbf x})\) is small compared to \(R^2\). Our measure of nonlinearity in \(\zeta (\beta {\mathbf x})\) is analogous to the maximal curvature studied by D. R. Cox and N. J. H. Small [Biometrika 65, 263-272 (1978; Zbl 0386.62041)]. The bias bound is tight; we construct the least favorable models that achieve it. As an illustration, the theory is applied to a special case.
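The consistency statement can be illustrated numerically. The sketch below is not taken from the paper. It assumes Gaussian \({\mathbf x}\) (for which \(E({\mathbf x}\mid \beta {\mathbf x})\) is linear in \(\beta {\mathbf x}\)), a hypothetical \(\tanh\) link, and arbitrary illustrative values for \(\alpha\), \(\beta\), and the noise level. The function names `ols_direction` and `simulate` are made up for this example. It fits ordinary least squares of \(y\) on \({\mathbf x}\) and reports the cosine between the estimated slope and the true \(\beta\).

```python
import numpy as np


def ols_direction(x, y):
    """Least squares regression of y on x (with intercept); return the unit slope vector."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    slope = coef[1:]  # drop the intercept
    return slope / np.linalg.norm(slope)


def simulate(n=200_000, seed=0):
    """Cosine between the OLS slope direction and the true beta direction."""
    rng = np.random.default_rng(seed)
    alpha = 0.5                          # illustrative intercept (assumption)
    beta = np.array([1.0, 2.0, -1.0])    # illustrative coefficients (assumption)
    # Gaussian regressors: E(x | beta x) is linear in beta x.
    x = rng.standard_normal((n, 3))
    # Nonlinear link g = tanh, treated as unknown by the linear regression.
    y = np.tanh(alpha + x @ beta) + 0.1 * rng.standard_normal(n)
    est_dir = ols_direction(x, y)
    true_dir = beta / np.linalg.norm(beta)
    return float(est_dir @ true_dir)


cosine = simulate()
print(f"cosine between OLS slope and beta: {cosine:.5f}")
```

A cosine close to 1 indicates that the direction of \(\beta\) is recovered even though the fitted model ignores the link. Replacing the Gaussian draw with a distribution for which \(E({\mathbf x}\mid \beta {\mathbf x})\) is nonlinear would be one way to explore the bias that the bound controls.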

MSC:
62J05 Linear regression; mixed models
62J12 Generalized linear models (logistic models)