zbMATH — the first resource for mathematics

Can one estimate the conditional distribution of post-model-selection estimators? (English) Zbl 1106.62029
Summary: We consider the problem of estimating the conditional distribution of a post-model-selection estimator where the conditioning is on the selected model. The notion of a post-model-selection estimator here refers to the combined procedure resulting from first selecting a model (e.g., by a model selection criterion such as AIC or by a hypothesis testing procedure) and then estimating the parameters in the selected model (e.g., by least-squares or maximum likelihood), all based on the same data set.
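The two-step procedure described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a Gaussian linear regression with two correlated regressors, a choice between a restricted model (second coefficient set to zero) and the full model by AIC, and least-squares estimation of the first coefficient in the selected model. The names `aic`, `post_selection_estimate` and `simulate`, and the parameters `alpha`, `beta` and `rho`, are hypothetical.

```python
import numpy as np


def aic(y, X):
    """Return the Gaussian AIC (up to an additive constant) and the LS fit."""
    n, k = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return n * np.log(rss / n) + 2 * k, coef


def post_selection_estimate(y, X):
    """Select a model by AIC, then estimate the first coefficient in it.

    Both steps use the same data (y, X), as in the summary's definition.
    """
    aic_restricted, coef_restricted = aic(y, X[:, :1])  # first regressor only
    aic_full, coef_full = aic(y, X)                      # both regressors
    if aic_restricted <= aic_full:
        return "restricted", coef_restricted[0]
    return "full", coef_full[0]


def simulate(alpha, beta, n=100, reps=2000, rho=0.7, seed=0):
    """Monte Carlo draws of sqrt(n) * (estimate - alpha), split by selected model.

    The design (correlation rho between regressors, unit error variance)
    is an assumption made for illustration only.
    """
    rng = np.random.default_rng(seed)
    draws = {"restricted": [], "full": []}
    for _ in range(reps):
        z = rng.standard_normal((n, 2))
        x1 = z[:, 0]
        x2 = rho * z[:, 0] + np.sqrt(1.0 - rho**2) * z[:, 1]
        X = np.column_stack([x1, x2])
        y = alpha * x1 + beta * x2 + rng.standard_normal(n)
        model, estimate = post_selection_estimate(y, X)
        draws[model].append(np.sqrt(n) * (estimate - alpha))
    return {m: np.asarray(v) for m, v in draws.items()}
```

Calling `simulate(1.0, 0.2)` returns, for each model, the simulated values of the scaled estimation error from the replications in which that model was selected, that is, a Monte Carlo approximation to the conditional distribution in this toy setup. The simulation can do this only because it is given the true `alpha` and `beta`, which an estimator working from the data alone does not have.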
We show that it is impossible to estimate this distribution with reasonable accuracy even asymptotically. In particular, we show that no estimator for this distribution can be uniformly consistent (not even locally). This follows as a corollary to (local) minimax lower bounds on the performance of estimators for this distribution. Similar impossibility results are also obtained for the conditional distribution of linear functions (e.g., predictors) of the post-model-selection estimator.
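One common way to formalize the uniform-consistency claim is sketched here in notation that is assumed, not taken from the summary. Write \(G_{n,\theta}(t \mid M)\) for the conditional distribution function of the (suitably scaled) post-model-selection estimator given that model \(M\) is selected, under parameter \(\theta\) and sample size \(n\), and write \(\hat G_n(t)\) for an estimator of it. Uniform consistency over a parameter set \(B\) would then mean

\[
\sup_{\theta \in B} P_{n,\theta}\bigl( \lvert \hat G_n(t) - G_{n,\theta}(t \mid M) \rvert > \delta \bigr) \longrightarrow 0 \quad \text{as } n \to \infty, \text{ for every } \delta > 0 .
\]

Read this way, the impossibility result says that no estimator \(\hat G_n\) achieves this convergence, and the "locally" qualifier corresponds to taking \(B\) to be a neighborhood of a fixed parameter value instead of the whole parameter space. The exact neighborhoods and lower bounds are given in the paper and are not stated in this summary.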

MSC:
62F12 Asymptotic properties of parametric estimators
62J05 Linear regression; mixed models
62C99 Statistical decision theory
62F10 Point estimation
62J07 Ridge regression; shrinkage estimators (Lasso)