# zbMATH — the first resource for mathematics

Models as approximations. II. A model-free theory of parametric regression. (English) Zbl 1440.62021
Summary: We develop a model-free theory of general types of parametric regression for i.i.d. observations. The theory replaces the parameters of parametric models with statistical functionals, to be called “regression functionals”, defined on large nonparametric classes of joint $$x$$-$$y$$ distributions, without assuming a correct model. Parametric models are reduced to heuristics to suggest plausible objective functions. An example of a regression functional is the vector of slopes of linear equations fitted by OLS to largely arbitrary $$x$$-$$y$$ distributions, without assuming a linear model (see Part I). More generally, regression functionals can be defined by minimizing objective functions, solving estimating equations, or with ad hoc constructions. In this framework, it is possible to achieve the following: (1) define a notion of “well-specification” for regression functionals that replaces the notion of correct specification of models, (2) propose a well-specification diagnostic for regression functionals based on reweighting distributions and data, (3) decompose sampling variability of regression functionals into two sources, one due to the conditional response distribution and another due to the regressor distribution interacting with misspecification, both of order $$N^{-1/2}$$, (4) exhibit plug-in/sandwich estimators of standard error as limit cases of $$x$$-$$y$$ bootstrap estimators, and (5) provide theoretical heuristics to indicate that $$x$$-$$y$$ bootstrap standard errors may generally be preferred over sandwich estimators.
For Part I, see [Zbl 1440.62020].
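
As a rough illustration of items (4) and (5) in the summary, the following sketch treats the OLS coefficient vector as a regression functional evaluated on data whose true response surface is not linear, and compares a plug-in sandwich standard error with an $$x$$-$$y$$ (pairs) bootstrap standard error. It is not the authors' code: the function names `ols_functional`, `sandwich_se`, and `xy_bootstrap_se`, the synthetic data, the HC0-style form of the sandwich, and the number of bootstrap replicates are all assumptions made for this example.

```python
import numpy as np


def ols_functional(X, y):
    """OLS coefficients (intercept first), viewed as a plug-in regression functional."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta


def sandwich_se(X, y):
    """Plug-in sandwich standard errors (HC0-style form, assumed for illustration)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    resid = y - Xd @ ols_functional(X, y)
    bread = np.linalg.inv(Xd.T @ Xd)
    meat = Xd.T @ (Xd * resid[:, None] ** 2)
    return np.sqrt(np.diag(bread @ meat @ bread))


def xy_bootstrap_se(X, y, n_boot=500, seed=0):
    """x-y bootstrap: resample (x_i, y_i) pairs jointly and refit each time."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # indices drawn with replacement
        draws.append(ols_functional(X[idx], y[idx]))
    return np.std(np.array(draws), axis=0, ddof=1)


# Synthetic data (an assumption): the conditional mean is x + x^2, so a
# straight-line fit is misspecified, yet the OLS functional is still defined.
rng = np.random.default_rng(1)
x = rng.normal(size=400)
y = x + x**2 + rng.normal(scale=0.5, size=400)

beta = ols_functional(x, y)
se_sand = sandwich_se(x, y)
se_boot = xy_bootstrap_se(x, y)

print("OLS functional (intercept, slope):", beta)
print("sandwich SE:                      ", se_sand)
print("x-y bootstrap SE:                 ", se_boot)
```

Resampling the pairs jointly lets the bootstrap pick up both sources of sampling variability named in item (3), whereas resampling residuals alone would hold the regressors fixed. In a sample of this size the two standard errors are expected to be of similar magnitude, without being identical.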

##### MSC:
- 62A01 Foundations and philosophical topics in statistics
- 62J05 Linear regression; mixed models
- 62F35 Robustness and adaptive procedures (parametric inference)
- 62P20 Applications of statistics to economics
bootstrap; R