zbMATH — the first resource for mathematics

Nonlinear confounding in high-dimensional regression. (English) Zbl 0873.62071
Summary: It is not uncommon to find nonlinear patterns in the scatterplots of regressor variables, but how such findings affect standard regression analysis remains largely unexplored. This article offers a theory of nonlinear confounding, a term for the situation in which a nonlinear relationship among the regressors leads to difficulties in modeling and in related analysis of the data. The theory begins with a measure of nonlinearity between two regressor variables. This measure is then used to assess nonlinearity between any two projections of the high-dimensional regressor, and a method for finding the most nonlinear projections is given. Nonlinear confounding is addressed by taking a fresh look at fundamental issues such as the validity of prediction and inference, diagnostics, regression surface approximation, model uncertainty and Fisher information loss.
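
The summary does not define the paper's nonlinearity measure, so the sketch below is only an illustrative stand-in and not the author's method. It assumes a classical diagnostic: the gap between the correlation ratio (variance explained by binned conditional means) and the squared linear correlation. The function names, the bin count, and the simulated data are all hypothetical choices made for this example.

```python
import random
from statistics import fmean


def squared_correlation(x, y):
    """Squared Pearson correlation: share of var(y) explained by a straight line in x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def correlation_ratio(x, y, bins=10):
    """Share of var(y) explained by the mean of y within equal-count bins of x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    n = len(order)
    my = fmean(y)
    total = sum((b - my) ** 2 for b in y)
    between = 0.0
    for k in range(bins):
        idx = order[k * n // bins:(k + 1) * n // bins]
        if idx:
            mean_k = fmean(y[i] for i in idx)
            between += len(idx) * (mean_k - my) ** 2
    return between / total


def nonlinearity_index(x, y, bins=10):
    """Assumed stand-in measure: extra variance explained beyond the linear fit."""
    return max(0.0, correlation_ratio(x, y, bins) - squared_correlation(x, y))


# Simulated regressors (hypothetical data): one pair linearly related, one quadratically.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(2000)]
noise = [rng.gauss(0, 0.1) for _ in range(2000)]
x_lin = [2 * a + e for a, e in zip(x1, noise)]
x_quad = [a * a + e for a, e in zip(x1, noise)]

linear_score = nonlinearity_index(x1, x_lin)
quadratic_score = nonlinearity_index(x1, x_quad)
print(f"linear pair: {linear_score:.3f}, quadratic pair: {quadratic_score:.3f}")
```

Under these assumptions the quadratic pair should score well above the linear pair, which is the kind of contrast a pairwise nonlinearity measure between regressors is meant to expose. Applying the same index to pairs of linear projections of a high-dimensional regressor would be one way to explore the idea further, although the search method used in the paper is not described in this record.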

MSC:
62J20 Diagnostics, and linear inference and regression
62J02 General nonlinear regression
62J99 Linear inference, regression
Software:
LISP-STAT
References:
[1] ALDRIN, M., BØLVIKEN, E. and SCHWEDER, T. 1993. Projection pursuit regression for moderate non-linearities. Comput. Statist. Data Anal. 16 379–403. · Zbl 0937.62535 · doi:10.1016/0167-9473(93)90156-N
[2] BICKEL, P. J., KLASSEN, C. A. J., RITOV, Y. and WELLNER, J. A. 1992. Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ. Press.
[3] BOX, G. E. P. and COX, D. R. 1964. An analysis of transformations. J. Roy. Statist. Soc. Ser. B 26 211–246. · Zbl 0156.40104
[4] BOX, G. E. P. and DRAPER, N. 1987. Empirical Model-Building and Response Surfaces. Wiley, New York. · Zbl 0614.62104
[5] BREIMAN, L. and FRIEDMAN, J. 1985. Estimating optimal transformations for multiple regression and correlation. J. Amer. Statist. Assoc. 80 580–597. · Zbl 0594.62044 · doi:10.2307/2288473
[6] BRILLINGER, D. R. 1983. A generalized linear model with “Gaussian” regressor variables. In A Festschrift for Erich L. Lehmann (P. J. Bickel, K. A. Doksum and J. L. Hodges, Jr., eds.) 97–114. Wadsworth. · Zbl 0519.62050
[7] BRILLINGER, D. R. 1991. Discussion of “Sliced inverse regression.” J. Amer. Statist. Assoc. 86 333. · Zbl 1353.62035
[8] BUJA, A., HASTIE, T. and TIBSHIRANI, R. 1989. Linear smoothers and additive models. Ann. Statist. 17 453–555. · Zbl 0689.62029 · doi:10.1214/aos/1176347115
[9] CARROLL, R. J. and LI, K. C. 1992. Measurement error regression with unknown link: dimension reduction and data visualization. J. Amer. Statist. Assoc. 87 1040–1050. · Zbl 0765.62002 · doi:10.2307/2290641
[10] CHEN, H. 1991. Estimation of a projection-pursuit type regression model. Ann. Statist. 19 142–157. · Zbl 0736.62055 · doi:10.1214/aos/1176347974
[11] COOK, R. D. 1993. Exploring partial residual plots. Technometrics 35 351–362. · Zbl 0800.62018 · doi:10.2307/1270269
[12] COOK, R. D. 1994. On the interpretation of regression plots. J. Amer. Statist. Assoc. 89 177–189. · Zbl 0791.62066 · doi:10.2307/2291214
[13] COOK, R. D. and NACHTSHEIM, C. J. 1994. Re-weighting to achieve elliptically contoured covariates in regression. J. Amer. Statist. Assoc. 89 592–599. · Zbl 0799.62078 · doi:10.2307/2290862
[14] COOK, R. D. and WEISBERG, S. 1991. Discussion of “Sliced inverse regression” by K. C. Li. J. Amer. Statist. Assoc. 86 328–332. · Zbl 1353.62037
[15] COOK, R. D. and WEISBERG, S. 1994. An Introduction to Regression Graphics. Wiley, New York. · Zbl 0925.62287
[16] COX, D. R. and SNELL, E. J. 1981. Applied Statistics: Principles and Examples. Chapman & Hall, New York. · Zbl 0612.62002
[17] DUAN, N. and LI, K. C. 1991a. Slicing regression: a link-free regression method. Ann. Statist. 19 505–530. · Zbl 0738.62070 · doi:10.1214/aos/1176348109
[18] DUAN, N. and LI, K. C. 1991b. A bias bound for applying linear regression to a general linear model. Statist. Sinica 1 127–136. · Zbl 0824.62057
[19] FRIEDMAN, J. and STUETZLE, W. 1981. Projection pursuit regression. J. Amer. Statist. Assoc. 76 817–823. · doi:10.2307/2287576
[20] GU, C. 1992. Diagnostics for nonparametric regression models with additive terms. J. Amer. Statist. Assoc. 87 1051–1058.
[21] HALL, P. 1989. On projection pursuit regression. Ann. Statist. 17 573–588. · Zbl 0698.62041 · doi:10.1214/aos/1176347126
[22] HALL, P. and LI, K. C. 1993. On almost linearity of low dimensional projections from high dimensional data. Ann. Statist. 21 867–889. · Zbl 0782.62065 · doi:10.1214/aos/1176349155
[23] HÄRDLE, W., HALL, P. and ICHIMURA, H. 1993. Optimal smoothing in single-index models. Ann. Statist. 21 157–178. · Zbl 0770.62049 · doi:10.1214/aos/1176349020
[24] HÄRDLE, W. and STOKER, T. 1989. Investigating smooth multiple regression by the method of average derivatives. J. Amer. Statist. Assoc. 84 986–995. · Zbl 0703.62052 · doi:10.2307/2290074
[25] HARRISON, D. and RUBINFELD, D. L. 1978. Hedonic housing prices and the demand for clean air. J. Environmental Economics and Management 5 81–102. · Zbl 0375.90023 · doi:10.1016/0095-0696(78)90006-2
[26] HSING, T. and CARROLL, R. J. 1992. Asymptotic properties of sliced inverse regression. Ann. Statist. 20 1040–1061. · Zbl 0821.62019 · doi:10.1214/aos/1176348669
[27] LI, K. C. 1990. Data-visualization with SIR: a transformation-based projection pursuit method. Technical report.
[28] LI, K. C. 1991. Sliced inverse regression for dimension reduction (with discussion). J. Amer. Statist. Assoc. 86 316–342. · Zbl 0742.62044 · doi:10.2307/2290563
[29] LI, K. C. 1992a. Uncertainty analysis for mathematical models with SIR. In Probability and Statistics (J. Ze-Pei, Y. Shi-Jian, C. Ping and W. Rong, eds.) 138–162. World Scientific, Singapore.
[30] LI, K. C. 1992b. On principal Hessian directions for data visualization and dimension reduction: another application of Stein’s lemma. J. Amer. Statist. Assoc. 87 1025–1039. · Zbl 0765.62003 · doi:10.2307/2290640
[31] LI, K. C. and DUAN, N. 1989. Regression analysis under link violation. Ann. Statist. 17 1009–1052. · Zbl 0753.62041 · doi:10.1214/aos/1176347254
[32] NELDER, J. A. and WEDDERBURN, R. W. M. 1972. Generalized linear models. J. Roy. Statist. Soc. Ser. A 135 370–384.
[33] SAMAROV, A. 1993. Exploring regression structure using functional estimation. J. Amer. Statist. Assoc. 88 836–847. · Zbl 0790.62035 · doi:10.2307/2290772
[34] TIERNEY, L. 1990. LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics. Wiley, New York. · Zbl 0747.62007
[35] WHITE, H. 1989. Some asymptotic results for learning in single hidden-layer feed-forward network models. J. Amer. Statist. Assoc. 84 1003–1013. · Zbl 0721.62081 · doi:10.2307/2290076
Author address: Los Angeles, California 90024. E-mail: kcli@math.ucla.edu
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.