Breast cancer relative hazard estimates from case-control and cohort designs with missing data on mammographic density. (English) Zbl 1205.62163

Summary: We analyzed data from the Breast Cancer Detection Demonstration Project (BCDDP) to obtain multivariate relative hazard models for breast cancer that included mammographic density (MD) in addition to standard risk factors. Data from the BCDDP were collected from a stratified case-control study in the screening phase (1973–1980) and from follow-up of three subcohorts in the follow-up phase (1980–1995). For both phases, MD measurements were only available for about half the women who developed breast cancer (cases) and a small fraction of noncases. We used a logistic regression model for the stratified case-control study and developed a general pseudo-likelihood approach to accommodate missing covariate data (MD) by adapting the method of Scott and Wild and Breslow and Holubkov. We showed that this method was substantially more efficient than a previously proposed weighted-likelihood method. We assumed piecewise exponential models for the analysis of each subcohort, with the missing covariate (MD) distribution conditional on the observed information modeled with polytomous logistic regression. We developed an EM algorithm for estimation, which allowed for time-varying covariates, incomplete follow-up, and left truncation. We analyzed the three follow-up subcohorts separately and then combined the relative hazard models from the case-control and cohort data. The final model included main effects for MD, weight, age at first live birth, number of previous breast biopsies, and number of sisters or mother with breast cancer and was more discriminating (higher concordance) than the original model of Gail et al., which included standard risk factors but not MD. In a separate work, we combined this relative hazard model with other data to project absolute breast cancer risk.
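As an illustration of the piecewise exponential model with left truncation mentioned in the summary, the following is a minimal sketch under simplifying assumptions, not the authors' implementation. It assumes fully observed, time-fixed covariates, and it omits the missing-MD model, the EM algorithm, and time-varying covariates. The function names (`exposure_by_interval`, `log_likelihood`), the cut points, and the hazard values are all hypothetical. Each subject contributes time at risk only from study entry onward, which is how left truncation enters the likelihood.

```python
import math

def exposure_by_interval(entry, exit_, cuts):
    """Time at risk within each interval (cuts[k], cuts[k+1]] between entry and exit_.

    Starting the clock at `entry` rather than at cuts[0] handles left truncation.
    """
    return [max(0.0, min(exit_, hi) - max(entry, lo))
            for lo, hi in zip(cuts[:-1], cuts[1:])]

def log_likelihood(subjects, cuts, base_hazards, beta):
    """Piecewise exponential log-likelihood with relative hazard exp(x . beta).

    subjects: iterable of (entry, exit_, event, x), where `event` is True for a
    case and False for a censored subject, and x is a list of covariates.
    Assumes cuts[0] <= entry < exit_ <= cuts[-1] (illustrative restriction).
    """
    total = 0.0
    for entry, exit_, event, x in subjects:
        rel = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        exposures = exposure_by_interval(entry, exit_, cuts)
        # Survival contribution: minus the cumulative hazard over time at risk.
        total -= rel * sum(h * t for h, t in zip(base_hazards, exposures))
        if event:
            # Index of the interval (lo, hi] that contains the event time.
            k = max(i for i, lo in enumerate(cuts[:-1]) if lo < exit_)
            total += math.log(base_hazards[k] * rel)
    return total

# Hypothetical example: one case entering at age offset 2 and diagnosed at 7.
example = [(2.0, 7.0, True, [1.0])]
value = log_likelihood(example, [0.0, 5.0, 10.0], [0.01, 0.02], [0.0])
```

In a fuller treatment along the lines described in the summary, a function of this kind would be evaluated inside the M-step of an EM algorithm, with each subject lacking MD weighted over the possible MD categories.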


62P10 Applications of statistics to biology and medical sciences; meta analysis