
Regularized quantile regression for ultrahigh-dimensional data with nonignorable missing responses. (English) Zbl 1442.62082

The authors consider the quantile regression model \(Y_i=x_i^T\beta(\tau)+e_i(\tau),\;i=1,\ldots,n,\) where \(x_i=(1,X_{i1},X_{i2},\ldots,X_{ip_n})^T\) is the covariate vector, \(Y_i\) is the response, \(\beta(\tau)=(\beta_0(\tau),\beta_1(\tau),\ldots, \beta_{p_n}(\tau))^T\) is an unknown coefficient vector, and \(e_i(\tau)\) is an independent random error whose \(\tau\)th quantile is \(0\). The data are assumed to be ultrahigh-dimensional with nonignorable missing responses, and a regularized estimator of \(\beta(\tau)\) is sought.
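The condition that the error has \(\tau\)th quantile \(0\) is usually expressed through the check loss \(\rho_\tau(u)=u(\tau-1\{u<0\})\). The review does not write this loss out, so the following Python sketch is an illustration under that standard assumption, not the authors' code. It shows that minimizing the average check loss over a constant recovers the sample \(\tau\)th quantile. The function names `check_loss`, `mean_check_loss` and `argmin_constant` are hypothetical.

```python
import random

def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def mean_check_loss(y, c, tau):
    """Average check loss of the residuals y_i - c."""
    return sum(check_loss(yi - c, tau) for yi in y) / len(y)

def argmin_constant(y, tau):
    # The objective is convex and piecewise linear in c, so a minimizer
    # is attained at one of the observed values; search over those.
    return min(y, key=lambda c: mean_check_loss(y, c, tau))

random.seed(0)
y = [random.gauss(0.0, 1.0) for _ in range(501)]  # simulated data, illustration only
tau = 0.25
c_hat = argmin_constant(y, tau)

# With n = 501 and tau = 0.25 the minimizer is the 126th order statistic.
sample_quantile = sorted(y)[125]
```

In the regression setting the constant `c` is replaced by the linear predictor \(x_i^T\beta(\tau)\), and the same loss is averaged over the observations.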
The propensity score is specified by a semiparametric exponential tilting model. A Pearson chi-square type test statistic is used to identify the important features in the sparse propensity score model, and the adjusted empirical likelihood method is used to estimate the parameters of the reduced model. With the estimated propensity score model, the authors propose an inverse probability weighted, penalized objective function for regularized estimation with the nonconvex SCAD and MCP penalty functions. The oracle properties of the proposed regularized estimators are established under the assumption that the propensity score model is of low dimension.
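To make the shape of such an objective concrete, here is a minimal Python sketch of an inverse probability weighted check loss plus a nonconvex penalty. It assumes the generic form \(n^{-1}\sum_i (\delta_i/\hat\pi_i)\,\rho_\tau(Y_i-x_i^T\beta)+\sum_{j\ge 1}p_\lambda(|\beta_j|)\), where \(\delta_i\) indicates an observed response and \(\hat\pi_i\) is the estimated propensity score. The exact normalization used in the paper is not given in the review and may differ. The SCAD and MCP formulas follow their usual definitions (Fan and Li, 2001; Zhang, 2010); the defaults `a=3.7` and `a=3.0`, the toy data, and all function names are assumptions made for illustration.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def scad(t, lam, a=3.7):
    """SCAD penalty evaluated at |t| (requires a > 2)."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2

def mcp(t, lam, a=3.0):
    """Minimax concave penalty evaluated at |t| (requires a > 1)."""
    t = abs(t)
    if t <= a * lam:
        return lam * t - t * t / (2 * a)
    return a * lam * lam / 2

def ipw_penalized_objective(beta, X, y, delta, pi_hat, tau, lam, penalty=scad):
    """IPW check loss over observed responses plus a penalty on the slopes."""
    n = len(X)
    fit = 0.0
    for xi, yi, di, pi in zip(X, y, delta, pi_hat):
        if di == 1:  # missing responses (delta = 0) contribute nothing
            resid = yi - sum(b * x for b, x in zip(beta, xi))
            fit += check_loss(resid, tau) / pi
    # beta[0] is the intercept and is left unpenalized (an assumption).
    pen = sum(penalty(bj, lam) for bj in beta[1:])
    return fit / n + pen

# Toy data, purely illustrative: each row of X starts with 1 for the intercept.
X = [[1, 0.5, -1.0], [1, 1.5, 0.3], [1, -0.7, 2.0], [1, 0.2, 0.1]]
y = [1.0, None, -0.5, 2.0]        # None marks a missing response
delta = [1, 0, 1, 1]
pi_hat = [0.8, 0.5, 0.5, 0.9]     # hypothetical estimated propensity scores
beta = [0.1, 1.0, 0.0]
value = ipw_penalized_objective(beta, X, y, delta, pi_hat, tau=0.5, lam=0.2)
value_mcp = ipw_penalized_objective(beta, X, y, delta, pi_hat, tau=0.5, lam=0.2, penalty=mcp)
```

Both penalties are constant beyond \(a\lambda\), so large coefficients are not shrunk further. This is the feature that distinguishes them from the \(L_1\) penalty. The sketch only evaluates the objective; minimizing it requires a nonconvex solver, which is outside the scope of this illustration.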
A simulation study and a real data analysis are presented to examine the finite-sample performance of the proposed approaches.

MSC:

62G08 Nonparametric regression and quantile regression
62D10 Missing data
62N05 Reliability and life testing
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

QICD
