zbMATH — the first resource for mathematics

Valid post-selection inference. (English) Zbl 1267.62080
Summary: It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to one of simultaneous inference and hence suitably widening conventional confidence and retention intervals. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing “simultaneity insurance” for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than the full H. Scheffé protection [The analysis of variance. (1959; Zbl 0086.34603)]. Importantly, it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results.
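
The simultaneity requirement in the summary can be illustrated numerically. The Python sketch below is not the authors' algorithm: it assumes Gaussian errors with known variance (σ = 1), a small full-rank design, and plain Monte Carlo, and every function name in it (`adjusted_directions`, `simulate_constants`, and so on) is hypothetical. It estimates three interval-widening constants on the same random draws: a naive one for a single prespecified coefficient, a simultaneous one over all coefficient estimates in all submodels, and a Scheffé-type one over all linear combinations in the column space of the design.

```python
import itertools
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-10):
    """Modified Gram-Schmidt: orthonormal basis of span(vectors)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def adjusted_directions(columns):
    """Unit vectors: column j residualized on the other columns of M,
    for every non-empty submodel M and every j in M.
    Assumes the columns are linearly independent (full-rank design)."""
    p = len(columns)
    directions = []
    for size in range(1, p + 1):
        for model in itertools.combinations(range(p), size):
            for j in model:
                others = orthonormal_basis(
                    [columns[k] for k in model if k != j])
                r = list(columns[j])
                for b in others:
                    c = dot(r, b)
                    r = [ri - c * bi for ri, bi in zip(r, b)]
                norm = math.sqrt(dot(r, r))
                directions.append([ri / norm for ri in r])
    return directions


def quantile(values, q):
    """Empirical q-quantile (an order statistic of the sample)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, math.ceil(q * len(ordered)) - 1)]


def simulate_constants(columns, alpha=0.05, draws=2000, seed=0):
    """Monte Carlo estimates of (naive, all-submodels, Scheffe-type)
    constants, assuming N(0, 1) errors with known variance."""
    rng = random.Random(seed)
    n = len(columns[0])
    directions = adjusted_directions(columns)
    basis = orthonormal_basis(columns)
    naive, simultaneous, scheffe = [], [], []
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        stats = [abs(dot(l, z)) for l in directions]
        naive.append(stats[0])            # one fixed coefficient only
        simultaneous.append(max(stats))   # all coefficients, all submodels
        # Norm of the projection of z onto the column space of the design.
        scheffe.append(math.sqrt(sum(dot(b, z) ** 2 for b in basis)))
    q = 1.0 - alpha
    return quantile(naive, q), quantile(simultaneous, q), quantile(scheffe, q)


# Illustrative design: n = 20 observations, p = 3 random predictors.
design_rng = random.Random(1)
columns = [[design_rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(3)]
k_naive, k_posi, k_scheffe = simulate_constants(columns)
print(k_naive, k_posi, k_scheffe)
```

Every adjusted direction is a unit vector in the column space of the design, so on any given draw the maximum over submodels cannot exceed the norm of the projected error vector. The estimated constants are therefore ordered naive ≤ simultaneous ≤ Scheffé-type, which mirrors the summary's statement about Scheffé protection.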

MSC:
62J05 Linear regression; mixed models
62J10 Analysis of variance and covariance (ANOVA)
62J15 Paired and multiple comparisons; multiple testing
62H12 Estimation in multivariate analysis
Software: ElemStatLearn
References:
[1] Angrist, J. D. and Pischke, J. S. (2009). Mostly Harmless Econometrics. Princeton Univ. Press, Princeton. · Zbl 1159.62090
[2] Bahadur, R. R. (1966). A note on quantiles in large samples. Ann. Math. Statist. 37 577-580. · Zbl 0147.18805 · doi:10.1214/aoms/1177699450
[3] Berk, R., Brown, L., Buja, A., Zhang, K. and Zhao, L. (2013). Supplement to “Valid post-selection inference.” · Zbl 1267.62080 · dx.doi.org
[4] Brown, L. (1967). The conditional level of Student’s $t$ test. Ann. Math. Statist. 38 1068-1071. · Zbl 0171.16703 · doi:10.1214/aoms/1177698776
[5] Buehler, R. J. and Feddersen, A. P. (1963). Note on a conditional property of Student’s $t$. Ann. Math. Statist. 34 1098-1100. · Zbl 0124.10101 · doi:10.1214/aoms/1177704034
[6] Claeskens, G. and Hjort, N. L. (2003). The focused information criterion (with discussion). J. Amer. Statist. Assoc. 98 900-945. · Zbl 1045.62003 · doi:10.1198/016214503000000819
[7] Dijkstra, T. K. and Veldkamp, J. H. (1988). Data-driven selection of regressors and the bootstrap. In On Model Uncertainty and Its Statistical Implications (T. K. Dijkstra, ed.) 17-38. Springer, Berlin. · Zbl 1114.62303
[8] Hall, P. and Carroll, R. J. (1989). Variance function estimation in regression: The effect of estimating the mean. J. R. Stat. Soc. Ser. B Stat. Methodol. 51 3-14. · Zbl 0672.62053
[9] Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York. · Zbl 1273.62005
[10] Hjort, N. L. and Claeskens, G. (2003). Frequentist model average estimators. J. Amer. Statist. Assoc. 98 879-899. · Zbl 1047.62003 · doi:10.1198/016214503000000828
[11] Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Med. 2 e124. · dx.doi.org
[12] Kabaila, P. (1998). Valid confidence intervals in regression after variable selection. Econometric Theory 14 463-482. · Zbl 04547506 · doi:10.1017/S0266466698144031
[13] Kabaila, P. (2009). The coverage properties of confidence regions after model selection. International Statistical Review 77 405-414.
[14] Kabaila, P. and Leeb, H. (2006). On the large-sample minimal coverage probability of confidence intervals after model selection. J. Amer. Statist. Assoc. 101 619-629. · Zbl 1119.62322 · doi:10.1198/016214505000001140 · miranda.asa.catchword.org
[15] Leeb, H. (2006). The distribution of a linear predictor after model selection: Unconditional finite-sample distributions and asymptotic approximations. In Optimality. Institute of Mathematical Statistics Lecture Notes-Monograph Series 49 291-311. IMS, Beachwood, OH. · Zbl 1268.62064 · doi:10.1214/074921706000000518
[16] Leeb, H. and Pötscher, B. M. (2003). The finite-sample distribution of post-model-selection estimators and uniform versus nonuniform approximations. Econometric Theory 19 100-142. · Zbl 1032.62011 · doi:10.1017/S0266466603191050
[17] Leeb, H. and Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory 21 21-59. · Zbl 1085.62004 · doi:10.1017/S0266466605050036
[18] Leeb, H. and Pötscher, B. M. (2006a). Performance limits for estimators of the risk or distribution of shrinkage-type estimators, and some general lower risk-bound results. Econometric Theory 22 69-97. · Zbl 1083.62060 · doi:10.1017/S0266466606060038
[19] Leeb, H. and Pötscher, B. M. (2006b). Can one estimate the conditional distribution of post-model-selection estimators? Ann. Statist. 34 2554-2591. · Zbl 1106.62029 · doi:10.1214/009053606000000821
[20] Leeb, H. and Pötscher, B. M. (2008a). Model selection. In The Handbook of Financial Time Series (T. G. Anderson, R. A. Davis, J. P. Kreiss and T. Mikosch, eds.) 785-821. Springer, New York.
[21] Leeb, H. and Pötscher, B. M. (2008b). Can one estimate the unconditional distribution of post-model-selection estimators? Econometric Theory 24 338-376. · Zbl 1284.62152
[22] Leeb, H. and Pötscher, B. M. (2008c). Sparse estimators and the oracle property, or the return of Hodges’ estimator. J. Econometrics 142 201-211. · Zbl 1418.62272 · doi:10.1016/j.jeconom.2007.05.017
[23] Moore, D. S. and McCabe, G. P. (2003). Introduction to the Practice of Statistics, 4th ed. Freeman, New York. · Zbl 0701.62002
[24] Olshen, R. A. (1973). The conditional level of the $F$-test. J. Amer. Statist. Assoc. 68 692-698. · Zbl 0271.62068 · doi:10.2307/2284800
[25] Pötscher, B. M. (1991). Effects of model selection on inference. Econometric Theory 7 163-185. · Zbl 04504752 · doi:10.1017/S0266466600004382
[26] Pötscher, B. M. (2006). The distribution of model averaging estimators and an impossibility result regarding its estimation. In Time Series and Related Topics. Institute of Mathematical Statistics Lecture Notes-Monograph Series 52 113-129. IMS, Beachwood, OH. · Zbl 1268.62066
[27] Pötscher, B. M. and Leeb, H. (2009). On the distribution of penalized maximum likelihood estimators: The LASSO, SCAD, and thresholding. J. Multivariate Anal. 100 2065-2082. · Zbl 1170.62046 · doi:10.1016/j.jmva.2009.06.010
[28] Pötscher, B. M. and Schneider, U. (2009). On the distribution of the adaptive LASSO estimator. J. Statist. Plann. Inference 139 2775-2790. · Zbl 1162.62063 · doi:10.1016/j.jspi.2009.01.003
[29] Pötscher, B. M. and Schneider, U. (2010). Confidence sets based on penalized maximum likelihood estimators in Gaussian regression. Electron. J. Stat. 4 334-360. · Zbl 1329.62156 · doi:10.1214/09-EJS523
[30] Pötscher, B. M. and Schneider, U. (2011). Distributional results for thresholding estimators in high-dimensional Gaussian regression models. Electron. J. Stat. 5 1876-1934. · Zbl 1271.62149 · doi:10.1214/11-EJS659
[31] Scheffé, H. (1959). The Analysis of Variance. Wiley, New York. · Zbl 0086.34603
[32] Sen, P. K. (1979). Asymptotic properties of maximum likelihood estimators based on conditional specification. Ann. Statist. 7 1019-1033. · Zbl 0413.62020 · doi:10.1214/aos/1176344785
[33] Sen, P. K. and Saleh, A. K. M. E. (1987). On preliminary test and shrinkage $M$-estimation in linear models. Ann. Statist. 15 1580-1592. · Zbl 0639.62046 · doi:10.1214/aos/1176350611
[34] Wyner, A. D. (1967). Random packings and coverings of the unit $n$-sphere. Bell System Tech. J. 46 2111-2118. · Zbl 0262.60002
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.