zbMATH — the first resource for mathematics

Semiparametric Bayesian analysis of nutritional epidemiology data in the presence of measurement error. (English) Zbl 1192.62092
Summary: We propose a semiparametric Bayesian method for handling measurement error in nutritional epidemiological data. Our goal is to estimate nonparametrically the form of association between a disease and exposure variable while the true values of the exposure are never observed. Motivated by nutritional epidemiological data, we consider the setting where a surrogate covariate is recorded in the primary data, and a calibration data set contains information on the surrogate variable and repeated measurements of an unbiased instrumental variable of the true exposure.
We develop a flexible Bayesian method where not only is the relationship between the disease and exposure variable treated semiparametrically, but also the relationship between the surrogate and the true exposure is modeled semiparametrically. The two nonparametric functions are modeled simultaneously via B-splines. In addition, we model the distribution of the exposure variable as a Dirichlet process mixture of normal distributions, thus making its modeling essentially nonparametric and placing this work into the context of functional measurement error modeling. We apply our method to the NIH-AARP Diet and Health Study and examine its performance in a simulation study.
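As a rough illustration of the B-spline representation mentioned in the summary (a sketch only, not the authors' implementation), a nonparametric function can be written as f(x) = sum_j beta_j B_j(x), where the B_j are B-spline basis functions. The snippet below evaluates such a basis with the standard Cox-de Boor recursion. The knot vector, the degree, and the function names `bspline_basis` and `spline_value` are assumptions chosen for this example; the summary does not specify them.

```python
def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions of `degree` at the point x.

    Uses the Cox-de Boor recursion. `knots` must be a non-decreasing list;
    the result has len(knots) - degree - 1 entries.
    """
    # Degree 0: indicator functions of the half-open knot intervals.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    # Close the last non-empty interval on the right so x == knots[-1] is covered.
    if x == knots[-1]:
        for i in range(len(knots) - 2, -1, -1):
            if knots[i] < knots[i + 1]:
                basis[i] = 1.0
                break
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        new_basis = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            # Terms with a zero denominator (repeated knots) are taken as zero.
            left = ((x - knots[i]) / left_den * basis[i]
                    if left_den > 0 else 0.0)
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            new_basis.append(left + right)
        basis = new_basis
    return basis


def spline_value(x, coefs, knots, degree):
    """Evaluate f(x) = sum_j coefs[j] * B_j(x)."""
    return sum(c * b for c, b in zip(coefs, bspline_basis(x, knots, degree)))


# Hypothetical setup: cubic splines on [0, 1] with three interior knots.
# Boundary knots are repeated degree + 1 times (a "clamped" knot vector).
DEGREE = 3
KNOTS = [0.0] * 4 + [0.25, 0.5, 0.75] + [1.0] * 4
N_BASIS = len(KNOTS) - DEGREE - 1  # number of basis functions
```

In a Bayesian fit of the kind the summary describes, the coefficients would receive a prior and be sampled together with the unobserved exposure values; that step is outside the scope of this sketch.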

62F15 Bayesian inference
62G05 Nonparametric estimation
92C50 Medical applications (general)
62P10 Applications of statistics to biology and medical sciences; meta analysis
65C60 Computational problems in statistics (MSC2010)
62N02 Estimation in survival analysis and censored data