Inferring food intake from multiple biomarkers using a latent variable model. (English) Zbl 1498.62205

Summary: Metabolomic-based approaches have gained much attention in recent years because of their potential to deliver objective tools for assessing food intake. In particular, multiple biomarkers have emerged for single foods. However, statistical tools for combining multiple biomarkers to quantitatively infer food intake are lacking, as are approaches for estimating the uncertainty around biomarker-based inferred intake.
Here, a latent variable model, multiMarker, is proposed to estimate the relationship between multiple metabolomic biomarkers and food intake in an intervention study conducted under the A-DIET research programme. The multiMarker model integrates factor analytic and mixture of experts models: the observed biomarker values are related to intake, which is described as a continuous latent variable that follows a flexible mixture of experts model with Gaussian components. The multiMarker model also facilitates inference on the latent intake when only biomarker data are subsequently observed. A Bayesian hierarchical modelling framework provides flexibility to adapt to different biomarker distributions and facilitates inference of the latent intake along with its associated uncertainty.
Simulation studies are conducted to assess the performance of the multiMarker model before it is applied to the motivating problem of quantifying apple intake.
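
The idea in the summary, relating several biomarkers to one continuous latent intake and then inferring that intake with its uncertainty, can be illustrated with a minimal Python sketch. It is not the multiMarker specification and does not use the multiMarker R package. It assumes a simplified toy model: each biomarker is linear in the latent intake with Gaussian noise, and the intake has a Gaussian mixture prior with fixed weights, so the covariate-dependent weights of a mixture of experts are replaced by constants. The function name `posterior_intake` and all parameter names and values are hypothetical, and the parameters are treated as known rather than estimated in a Bayesian hierarchical framework.

```python
import numpy as np


def posterior_intake(y, alpha, beta, sigma, weights, means, sds):
    """Posterior of a latent intake z given one vector of biomarker values y.

    Assumed toy model (an illustration, not the multiMarker specification):
        y_p = alpha_p + beta_p * z + eps_p,   eps_p ~ N(0, sigma_p^2)
        z   ~ sum_d weights_d * N(means_d, sds_d^2)
    Returns (posterior component weights, posterior mean, posterior sd).
    """
    y, alpha, beta, sigma, weights, means, sds = (
        np.asarray(a, dtype=float)
        for a in (y, alpha, beta, sigma, weights, means, sds)
    )
    resid = y - alpha

    # Information about z carried by the biomarkers (factor-analytic part).
    lik_prec = np.sum(beta**2 / sigma**2)
    lik_info = np.sum(beta * resid / sigma**2)

    # Conjugate Gaussian update within each mixture component.
    post_prec = 1.0 / sds**2 + lik_prec
    post_mean = (means / sds**2 + lik_info) / post_prec
    post_var = 1.0 / post_prec

    # Updated component weights: prior weight times the marginal likelihood
    # of y under that component (the shared 2*pi constant is dropped).
    log_w = np.empty(len(weights))
    for d in range(len(weights)):
        cov = np.diag(sigma**2) + sds[d] ** 2 * np.outer(beta, beta)
        dev = resid - beta * means[d]
        _, logdet = np.linalg.slogdet(cov)
        quad = dev @ np.linalg.solve(cov, dev)
        log_w[d] = np.log(weights[d]) - 0.5 * (logdet + quad)
    log_w -= log_w.max()  # stabilise before exponentiating
    w = np.exp(log_w)
    w /= w.sum()

    # Moments of the resulting Gaussian mixture posterior.
    mean = np.sum(w * post_mean)
    var = np.sum(w * (post_var + post_mean**2)) - mean**2
    return w, mean, np.sqrt(var)


if __name__ == "__main__":
    # Hypothetical values: three biomarkers, three intake levels (grams).
    rng = np.random.default_rng(0)
    alpha = np.array([1.0, 2.0, 0.5])
    beta = np.array([0.02, 0.05, 0.01])
    sigma = np.array([0.5, 0.5, 0.5])
    true_intake = 100.0
    y = alpha + beta * true_intake + rng.normal(0.0, sigma)
    w, mean, sd = posterior_intake(
        y, alpha, beta, sigma,
        weights=[1 / 3, 1 / 3, 1 / 3], means=[50.0, 100.0, 300.0],
        sds=[20.0, 20.0, 20.0],
    )
    print("component weights:", w)
    print("posterior mean and sd of intake:", mean, sd)
```

Because the toy posterior is itself a Gaussian mixture, its mean and standard deviation are available in closed form, which is why no sampling is needed in this sketch.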

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62H25 Factor analysis and principal components; correspondence analysis

Software:

multiMarker; R
