zbMATH — the first resource for mathematics

Generalized linear models with linear constraints for microbiome compositional data. (English) Zbl 1436.62596
Summary: Motivated by regression analysis for microbiome compositional data, this article considers generalized linear regression analysis with compositional covariates, where a group of linear constraints on regression coefficients are imposed to account for the compositional nature of the data and to achieve subcompositional coherence. A penalized likelihood estimation procedure using a generalized accelerated proximal gradient method is developed to efficiently estimate the regression coefficients. A de-biased procedure is developed to obtain asymptotically unbiased and normally distributed estimates, which leads to valid confidence intervals of the regression coefficients. Simulations results show the correctness of the coverage probability of the confidence intervals and smaller variances of the estimates when the appropriate linear constraints are imposed. The methods are illustrated by a microbiome study in order to identify bacterial species that are associated with inflammatory bowel disease (IBD) and to predict IBD using fecal microbiome.
62P10 Applications of statistics to biology and medical sciences; meta analysis
62J12 Generalized linear models (logistic models)
Full Text: DOI