Nonparametric regression under double-sampling designs. (English) Zbl 1215.62040

Summary: This paper studies nonparametric estimation of the regression function from surrogate outcome data under double-sampling designs, in which a proxy response is observed for the full sample and the true response is observed only on a validation set. The authors propose a two-step estimator: they first estimate the regression function with a kernel smoother based on the validation subsample, and then improve that estimate using the incomplete observations from the non-validation subsample together with the surrogate response from the full sample. Asymptotic normality of the proposed estimator is derived, and its effectiveness is demonstrated via simulations.
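
The two-step idea in the summary can be illustrated with a minimal sketch. It is not the authors' estimator, whose exact form the summary does not give. It assumes a Nadaraya-Watson smoother with a Gaussian kernel for step one and, for step two, a simple control-variate style correction that compares the smoothed surrogate on the validation subsample with the smoothed surrogate on the full sample. The names `nw_smooth`, `double_sampling_estimate`, and the weight `gamma` are hypothetical and introduced only for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_smooth(x0, xs, ys, h):
    """Nadaraya-Watson estimate of E[y | x = x0] with bandwidth h."""
    weights = [gaussian_kernel((x0 - x) / h) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no kernel mass at x0; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


def double_sampling_estimate(x0, x_all, w_all, y_val, val_idx, h, gamma=1.0):
    """Illustrative two-step estimate at x0 (an assumed form, not the paper's).

    x_all, w_all : covariate and surrogate response for the full sample
    y_val        : true response, aligned with val_idx (validation set only)
    val_idx      : indices of the validation observations within the full sample
    gamma        : hypothetical weight on the surrogate-based correction;
                   gamma = 0 gives back the validation-only smoother
    """
    x_val = [x_all[i] for i in val_idx]
    w_val = [w_all[i] for i in val_idx]

    # Step 1: kernel smoother of the true response on the validation subsample.
    m_val = nw_smooth(x0, x_val, y_val, h)

    # Step 2: smooth the surrogate on the validation subsample and on the full
    # sample, then use the difference between the two to adjust step 1.
    s_val = nw_smooth(x0, x_val, w_val, h)
    s_all = nw_smooth(x0, x_all, w_all, h)
    return m_val - gamma * (s_val - s_all)
```

In this sketch the correction vanishes when the validation set is the whole sample, and it is most useful when the surrogate is strongly correlated with the true response. How the weight and the bandwidth should be chosen is exactly the kind of question the paper's asymptotic theory addresses, so the defaults here are placeholders.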

MSC:

62G08 Nonparametric regression and quantile regression
62G20 Asymptotic properties of nonparametric inference
65C60 Computational problems in statistics (MSC2010)

References:

[1] J. Wittes, E. Lakatos, and J. Probstfield, Surrogate endpoints in clinical trials: Cardiovascular diseases, Statist. Med., 1989, 8: 415–425. · doi:10.1002/sim.4780080405
[2] J. Neyman, Contribution to the theory of sampling from human populations, J. Amer. Statist. Assoc., 1938, 33: 101–116. · Zbl 0018.22603 · doi:10.1080/01621459.1938.10503378
[3] M. S. Pepe, Inference using surrogate outcome data and a validation sample, Biometrika, 1992, 79: 355–365. · Zbl 0751.62049 · doi:10.1093/biomet/79.2.355
[4] S. R. Lipsitz, N. M. Laird, and D. P. Harrington, Weighted least squares analysis of repeated categorical measurements with outcomes subject to nonresponse, Biometrics, 1994, 50: 11–24. · Zbl 0826.62082 · doi:10.2307/2533193
[5] M. S. Pepe, M. Reilly, and T. R. Fleming, Auxiliary outcome data and the mean score method, J. Statist. Plann. Inference, 1994, 42: 137–160. · Zbl 0806.62090 · doi:10.1016/0378-3758(94)90194-5
[6] J. M. Robins, A. Rotnitzky, and L. P. Zhao, Estimation of regression coefficients when some regressors are not always observed, J. Amer. Statist. Assoc., 1994, 89: 846–866. · Zbl 0815.62043 · doi:10.1080/01621459.1994.10476818
[7] N. E. Breslow and K. C. Cain, Logistic regression for two-stage case-control data, Biometrika, 1988, 75: 11–20. · Zbl 0635.62110 · doi:10.1093/biomet/75.1.11
[8] J. J. Forster and P. W. F. Smith, Model-based inference for categorical survey data subject to non-ignorable non-response, J. R. Statist. Soc. B., 1998, 60: 57–70. · Zbl 0910.62010 · doi:10.1111/1467-9868.00108
[9] N. Chatterjee, Y. H. Chen, and N. E. Breslow, A pseudoscore estimator for regression problems with two-phase sampling, J. Amer. Statist. Assoc., 2003, 98: 158–168. · Zbl 1047.62031 · doi:10.1198/016214503388619184
[10] R. J. A. Little and D. Rubin, Statistical Analysis with Missing Data, 2nd Ed., John Wiley, New York, 2002. · Zbl 1011.62004
[11] Y. H. Chen and H. Chen, A unified approach to regression analysis under double-sampling designs, J. R. Statist. Soc. B., 2000, 62: 449–460. · Zbl 0963.62062 · doi:10.1111/1467-9868.00243
[12] J. Jiang and H. Zhou, Additive hazards regression with auxiliary covariates, Biometrika, 2007, 94: 359–369. · Zbl 1132.62091 · doi:10.1093/biomet/asm016
[13] J. Fan and I. Gijbels, Local Polynomial Modelling and Its Applications, Chapman and Hall, London, 1996. · Zbl 0873.62037
[14] J. Fan and I. Gijbels, Data-driven bandwidth selection in local polynomial fitting: Variable bandwidth and spatial adaptation, J. R. Statist. Soc. B., 1995, 57: 371–394. · Zbl 0813.62033
[15] W. Härdle, Applied Nonparametric Regression Analysis, Cambridge University Press, Cambridge, 1990.
[16] J. Jiang and P. Mack, Robust local polynomial regression for dependent data, Statist. Sinica, 2001, 11: 705–722. · Zbl 0978.62080